Problem:

The data set soep2000.csv contains, among other variables, the time until the end of unemployment (dauer) as well as a dummy variable for females (female). Use R to read the data and to solve the following tasks:

  1. Estimate separately for each gender the survivor functions for the duration of unemployment (in months). Interpret the estimates and compare the two curves.

  2. For both genders, obtain the median time until end of unemployment and the respective 95% confidence intervals.

  3. Compute a log-rank test for the comparison of the two groups.

Note: Here, experiencing an event means getting a job.

soep2000<-read.csv("soep2000.csv") # assumption: comma-separated file in the working directory
head(soep2000)
  1. Estimating using Kaplan-Meier:
library(survival)
km<-survfit(Surv(dauer,status)~ female,data= soep2000)
str(km)
List of 17
 $ n        : int [1:2] 1206 794
 $ time     : num [1:72] 1 2 3 4 5 6 7 8 9 10 ...
 $ n.risk   : num [1:72] 1206 1036 901 785 680 ...
 $ n.event  : num [1:72] 138 109 92 81 38 48 38 24 21 15 ...
 $ n.censor : num [1:72] 32 26 24 24 23 18 17 16 17 16 ...
 $ surv     : num [1:72] 0.886 0.792 0.711 0.638 0.602 ...
 $ std.err  : num [1:72] 0.0104 0.0149 0.0186 0.0222 0.0241 ...
 $ cumhaz   : num [1:72] 0.114 0.22 0.322 0.425 0.481 ...
 $ std.chaz : num [1:72] 0.00974 0.01402 0.0176 0.02101 0.02288 ...
 $ strata   : Named int [1:2] 36 36
  ..- attr(*, "names")= chr [1:2] "female=0" "female=1"
 $ type     : chr "right"
 $ logse    : logi TRUE
 $ conf.int : num 0.95
 $ conf.type: chr "log"
 $ lower    : num [1:72] 0.868 0.77 0.686 0.611 0.575 ...
 $ upper    : num [1:72] 0.904 0.816 0.738 0.666 0.632 ...
 $ call     : language survfit(formula = Surv(dauer, status) ~ female, data = soep2000)
 - attr(*, "class")= chr "survfit"
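The columns of the survfit object are linked by simple formulas, which can be checked against the printed values. The sketch below copies the first five entries of the female=0 stratum from the str(km) output above and recomputes the Kaplan-Meier estimate as a running product of (1 - events / at risk), the cumulative hazard as a running sum of events / at risk, and the log-type confidence limits as surv * exp(± z * std.err). The variable names are illustrative, and std_err uses the rounded values shown above, so agreement is only to about three decimals.

```r
# First five entries for the female=0 stratum, copied from the str(km) output
n_risk  <- c(1206, 1036, 901, 785, 680)
n_event <- c(138, 109, 92, 81, 38)
std_err <- c(0.0104, 0.0149, 0.0186, 0.0222, 0.0241)  # rounded, as printed

# Kaplan-Meier product-limit estimate of S(t)
surv <- cumprod(1 - n_event / n_risk)

# Cumulative hazard as a running sum of the hazard increments
cumhaz <- cumsum(n_event / n_risk)

# 95% limits for conf.type = "log": std.err is on the log(S) scale
z <- qnorm(0.975)
lower <- surv * exp(-z * std_err)
upper <- surv * exp( z * std_err)

round(cbind(surv, cumhaz, lower, upper), 3)
```

The rounded results should match the surv, cumhaz, lower and upper entries printed by str(km) for the first five months.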
library(GGally)
ggsurv(km)+
  geom_hline(yintercept = 0.5, linetype = 3)+
  geom_vline(xintercept=8,linetype=3)+
  geom_vline(xintercept=14,linetype=3)+
  labs(x="Time (months)",y="S(t)",title = "Duration of unemployment by gender")+
  ylim(c(0,1))+
  theme_bw()

summary_km<-(summary(km,rmean = "none"))$table
summary_km
         records n.max n.start events median 0.95LCL 0.95UCL
female=0    1206  1206    1206    726      8       7      10
female=1     794   794     794    396     14      12      16

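Analysis: The estimated median duration of unemployment is 8 months for males (female=0), with a 95% confidence interval of 7 to 10 months, and 14 months for females (female=1), with a 95% confidence interval of 12 to 16 months. In other words, after 8 months an estimated 50% of the unemployed males had found a job, whereas the females needed 14 months to reach the same share. The estimates suggest that males leave unemployment sooner than females.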

  2. 95% CI and Median:
quantile_25<-as.data.frame(quantile(km,.25))
colnames(quantile_25)<-c("estimate","lower band","upper band")

quantile_50<-as.data.frame(quantile(km,.50))
colnames(quantile_50)<-c("estimate","lower band","upper band")

#another method:
#
#q25<-do.call(cbind.data.frame,quantile(km,.25))
#

quantile_25
quantile_50

After 3 months, an estimated 25% of the unemployed males had found a job. For the unemployed females this took 5 months.
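A quantile of the duration can be read off the step function directly: the p-quantile is the first time at which the estimated survivor function drops to 1 - p or below. The sketch below applies this rule to the first five values of the female=0 stratum copied from the str(km) output. km_quantile is a hypothetical helper written for illustration, not a function of the survival package, and it ignores the special case of a curve that is flat at exactly 1 - p, which quantile() in the survival package treats separately.

```r
# First five time points and survival estimates for female=0, copied from str(km)
time <- 1:5
surv <- c(0.886, 0.792, 0.711, 0.638, 0.602)

# Hypothetical helper: first time at which S(t) <= 1 - p, or NA if never reached
km_quantile <- function(p, time, surv) {
  hit <- surv <= 1 - p
  if (!any(hit)) return(NA_real_)
  min(time[hit])
}

km_quantile(0.25, time, surv)  # 25% quantile for males
km_quantile(0.50, time, surv)  # NA: the median lies beyond these five months
```

The first call returns 3, in line with the 3 months reported for males. The second returns NA because the curve has not yet reached 0.5 within the first five months, consistent with the male median of 8 months in the summary table.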

  3. Log-rank test:
logrank<-survdiff(Surv(dauer,status)~ female,data= soep2000)
logrank
Call:
survdiff(formula = Surv(dauer, status) ~ female, data = soep2000)

            N Observed Expected (O-E)^2/E (O-E)^2/V
female=0 1206      726      651      8.62      22.1
female=1  794      396      471     11.92      22.1

 Chisq= 22.1  on 1 degrees of freedom, p= 3e-06 
Wilcoxon<-survdiff(Surv(dauer,status)~ female,data= soep2000,rho=1)
Wilcoxon
Call:
survdiff(formula = Surv(dauer, status) ~ female, data = soep2000, 
    rho = 1)

            N Observed Expected (O-E)^2/E (O-E)^2/V
female=0 1206      536      473      8.31        28
female=1  794      273      336     11.69        28

 Chisq= 28  on 1 degrees of freedom, p= 1e-07 
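The printed p-values can be reproduced from the printed chi-square statistics, since each statistic is compared with a chi-square distribution with 1 degree of freedom. The sketch below uses only numbers copied from the two outputs above; the variable names are illustrative. Because the printed Expected counts are rounded, the recomputed (O-E)^2/E values differ slightly from the printed 8.62 and 11.92.

```r
# Chi-square statistics copied from the two survdiff outputs
chisq <- c(logrank = 22.1, rho1 = 28)

# Upper-tail probability of a chi-square distribution with 1 degree of freedom
p <- pchisq(chisq, df = 1, lower.tail = FALSE)
signif(p, 1)

# (O-E)^2/E for the log-rank table, using the rounded Expected counts as printed
observed <- c(726, 396)
expected <- c(651, 471)
round((observed - expected)^2 / expected, 2)
```

To one significant digit the two p-values agree with the 3e-06 and 1e-07 shown in the outputs.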

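Both p-values are far below 0.05 and 0.01 (log-rank test: p = 3e-06; test with rho = 1, a Wilcoxon-type test that gives more weight to early event times: p = 1e-07). We therefore reject the null hypothesis that the survivor functions, and hence the hazard rates, of the two groups are identical.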
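To make the logic of the test concrete, the following sketch computes a two-group log-rank statistic by hand. At each event time, the number of events expected in group 0 under the null hypothesis is the total number of events times the share of group 0 among those still at risk; the statistic compares the summed observed and expected counts. The data are a hypothetical toy example, not taken from soep2000.csv, and all variable names are illustrative.

```r
# Hypothetical toy data (not from soep2000.csv)
time   <- c(1, 2, 4, 3, 5, 6)   # duration in months
status <- c(1, 1, 1, 1, 1, 1)   # 1 = event (found a job), 0 = censored
group  <- c(0, 0, 0, 1, 1, 1)

event_times <- sort(unique(time[status == 1]))

# Numbers at risk and numbers of events at each event time
n  <- sapply(event_times, function(t) sum(time >= t))
n0 <- sapply(event_times, function(t) sum(time >= t & group == 0))
d  <- sapply(event_times, function(t) sum(time == t & status == 1))
d0 <- sapply(event_times, function(t) sum(time == t & status == 1 & group == 0))

observed0 <- sum(d0)
expected0 <- sum(d * n0 / n)

# Hypergeometric variance of d0; zero when only one subject is at risk
v <- ifelse(n > 1, d * (n0 / n) * (1 - n0 / n) * (n - d) / (n - 1), 0)

chisq   <- (observed0 - expected0)^2 / sum(v)
p_value <- pchisq(chisq, df = 1, lower.tail = FALSE)
c(observed0 = observed0, expected0 = expected0, chisq = chisq, p_value = p_value)
```

With only six observations the toy test is not significant, which illustrates that the small p-values above come from a systematic difference across many event times in a large sample.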

Conclusion: Comparing the two genders, the data indicate that males find a job sooner than females. Their median duration of unemployment is shorter (8 versus 14 months), and the difference between the two survivor curves is statistically significant.
