"Please Name the First Two People you Would Ask for Help": The Effect of Limitation of the Number of Alters on Network Composition Tina Kogovšek1, Maja Mrzel2, and Valentina Hlebec3 Abstract Social network indicators (e.g., network size, network structure and network composition) and the quality of their measurement may be affected by different factors such as measurement method, type of social support, limitation of the number of alters, context of the questionnaire, question wording, personal characteristics of respondents such as age, gender or personality traits and others. In this paper we focus on the effect of limiting the number of alters on network composition indicators (e.g., percentage of kin, friends etc.), which are often used in substantive studies on social support, epidemiological studies and so on. Often social networks are only one among many topics measured in such large studies; therefore, limitation of the number of alters that can be named is often used directly (e.g., International Social Survey Programme) or indirectly (e.g., General Social Survey) in the network items. The analysis was done on two comparable data sets from different years. Data were collected by the name generator approach by students of the University of Ljubljana as part of various social science methodology courses. Network composition on the basis of direct use (i.e., already in the question wording) of limitation on the number of alters is compared to network composition on full network data (i.e.,collected without any limitations). 
1 University of Ljubljana, Faculty of Arts, Aškerčeva 2/Faculty of Social Sciences, Kardeljeva ploščad 5, SI-1000 Ljubljana, Slovenia
2 University of Ljubljana, Faculty of Social Sciences, Kardeljeva ploščad 5, SI-1000 Ljubljana, Slovenia
3 University of Ljubljana, Faculty of Social Sciences, Kardeljeva ploščad 5, SI-1000 Ljubljana, Slovenia

1 Introduction

Social network indicators (e.g., network size, network structure and network composition) and the quality of their measurement may be affected by different factors such as measurement method (e.g., Ferligoj and Hlebec, 1999; Kogovšek and Ferligoj, 2005; Kogovšek, 2006), type of social support (Ferligoj and Hlebec, 1999; Kogovšek and Ferligoj, 2004), limitation of the number of alters (e.g., Holland and Leinhardt, 1973; van Groenou et al., 1990; Hlebec and Kogovšek, 2005; Kogovšek and Hlebec, 2008), considering close or extended network (e.g., Morgan et al., 1997; Kogovšek and Ferligoj, 2004), context of the questionnaire (e.g., Bailey and Marsden, 1999), question wording (e.g., Straits, 2000), personal characteristics of respondents such as age, gender or personality traits (e.g., Kogovšek and Ferligoj, 2005) and other factors.

In this paper we focus on the effect of limitation of the number of alters on network composition indicators (e.g., percentage of kin, friends etc.), which are often used in substantive studies on social support as well as in large studies of a more general kind, such as the General Social Survey. Previous studies (e.g., Holland and Leinhardt, 1973; van Groenou et al., 1990) have shown that limiting the number of alters may lead to differences in network size, composition and structure as well as data quality. In most studies that explicitly used such a limitation, it usually ranged between three and eight alters. Respondents may use different strategies for naming their alters depending on whether the limitation is put to them or not.
Any information in the question wording may be an element that the respondent uses in formulating his/her response (e.g., Hippler et al., 1987; Sudman et al., 1996). Therefore, even though it may lengthen the survey, the no limit condition regarding the number of named alters is usually advised (van der Poel, 1993). However, network measurement items are often only part of larger survey instruments (e.g., International Social Survey Programme, General Social Survey, or Generations and Gender Programme), where limitations as to the number of alters are often used and even necessary for reasons of economy, reduced respondent burden, etc. The limitation may be direct (e.g., International Social Survey Programme), by which we mean that the limitation is put to the respondent directly within the question itself (e.g., please name up to five people with whom you socialize regularly). On the other hand, the limitation may be indirect (e.g., General Social Survey), which means the respondent is not aware of the limitation (it is not explicit in the question itself), but detailed data are collected by the interviewer only for the first few alters later on (e.g., Burt, 1984).4

4 Another possibility for an indirect limitation may arise in the phase of analysis. For different reasons (e.g., issues of comparability owing to the use of different data sets), a researcher may limit analysis to the first n named alters (e.g., Kogovšek and Hlebec, 2005).

Several recent studies (e.g., Hlebec and Kogovšek, 2005; Kogovšek and Hlebec, 2008) have shown that network composition obtained under the limit and no limit conditions is to some extent comparable. However, there were a number of other methodological differences (e.g., question wording, approach to collecting network members) in the instruments used, which could confound the effect of limitation in the number of alters with the effect of other factors.
Therefore, a methodological experiment was done using identical measurement approaches, but one with and one without the limitation regarding the number of alters. The analysis was done on two comparable data sets from two different years. In both cases data were collected by the name generator approach by students of the University of Ljubljana as part of various social science methodology courses. Network composition on the basis of direct use (i.e., already in the question wording) of limitation in the number of alters is compared to the composition based on full network data (collected without limitations).

2 Research design and data

Three types of social support were measured with six network generators:

1. Some tasks in the apartment or in the garden a person cannot do by him/herself. It may happen that you need someone to hold the ladder for you or help you move the furniture. Whom would you ask for help first? Whom would you ask for help as the second? (instrumental support)
2. Say you have the flu and have to lie down for a few days. You would need help with various household tasks, such as shopping and similar. Whom would you ask for help first? Whom would you ask for help as the second? (instrumental support)
3. Now imagine you needed to borrow a larger sum of money. Whom would you ask for help first? Whom would you ask for help as the second? (instrumental support)
4. Say you have problems in the relationship with your husband/wife/partner which you cannot solve on your own. Whom would you ask for help first? Whom would you ask for help as the second? Even if you are not married and do not have a partner, try to answer what you would do in such a case. (emotional support)
5. What about the case when you felt a little blue or depressed and would like to talk to someone about it. Whom would you ask for help first? Whom would you ask for help as the second? (emotional support)
6. Say you needed advice with regard to an important life decision, for instance getting a job or moving to another place. Whom would you ask for help first? Whom would you ask for help as the second? (informational support).5

Data were collected within the academic process during various courses on social science methodology, first in 2006 (Faculty of Social Sciences) and then again in 2008 (Faculty of Social Sciences and Faculty of Arts). In both instances quota samples from the general population defined by gender and three age groups were used.6 In 2006 there was no limitation regarding the number of alters. Data were collected by two different approaches: the name generator and the role relation approaches were used once in the first and once in the second wave (the waves were two weeks apart).7 In 2008 data were collected only once and only by the name generator approach, with direct limitation (in the question wording itself) to the first two alters.8

Table 1: Information on data sets.

Year                         2006                  2008
Limitation - N. of alters    no limitation         first 2 alters
Method                       Name generator        Name generator
N                            232                   331
Sample                       Quota (gender, age)   Quota (gender, age)
Collected by                 Students of FSS       Students of FSS and FA

Although the data were collected in two different years, we assume that data obtained by name generators are comparable for methodological tests, since the same type of sample was used and the wordings of the name generators were identical. First, some parts of the data sets had to be harmonized, since there were slight differences in question wording between the two sets (marital status, type of community, education and relation to ego). Because there was a limitation to the first two alters in 2008, network composition indicators used in further analyses are only an approximation. The network composition obtained in both years was compared and tested with three methods that are presented in the following sections.
First, a t-test was done to find significant differences in network composition assessed with the limit/no limit condition. Second, multiple regression was calculated with network composition indicators as dependent variables and several independent variables including the questionnaire design and demographic characteristics of respondents (the limit/no limit condition, gender, age, education and marital status). Third, MCA9 was performed to assess the effects of limitation in the number of alters, type of social support and strength of tie on network composition.

5 Here the 2008 version of the question wording (with limitation to the first two alters) is presented. In 2008 workplace support was also measured, but not in 2006; therefore, it is omitted from the analyses in this paper.
6 Panel design was not used. In each year a different set of respondents was used. There are no statistically significant differences between our samples regarding gender, age, education and marital status.
7 The role relationship approach is beyond the scope of the present study and is therefore not considered here.
8 Another interesting possibility would be to compare the indirect limitation in the 2006 data (i.e., using data on only the first two named alters) to the other two conditions. Unfortunately, in 2006, no data were collected about the rank ordering of named alters; therefore, extraction of the first two named persons was impossible.

3 Results

3.1 T-test

First, independent samples t-tests were done for network composition indicators as dependent variables (e.g., % of partner, mother and father) with presence/absence of limitation of the number of alters as the independent variable. T-tests were done for the overall network and for each type of support separately. Altogether 84 t-tests were done. Mean differences (2006 - 2008) are shown in Table 2. Statistically significant (at 5% level) differences are shaded in grey.

Table 2: Mean differences (statistically sig. in grey).
             All ntw.  Hshold.  Illness    Money  Partner  Depress.   Advice
Partner         -4.2      -.2      2.7      1.4      2.3       6.6      4.3
Mother           -.3      -.9      -.7      2.1      5.3       4.2      -.9
Father           -.4     -2.8      -.9      3.6       .1        .7      -.7
Daughter         1.0      2.8      -.5       .6      -.8       1.2       .6
Son              2.3       .2       .9      1.7      1.6        .9       .8
Sister          -1.2       .5      -.9     -1.3     -4.8      -2.1     -1.5
Brother          -.1       .2       .9     -3.3       .2        .4      -.4
Other kin*       2.6       .9      -.7      -.8     -3.8      -3.0       .5
Friend           3.6      5.1      4.6       .3      4.8      -1.7       .8
Neighbor          .7     -1.5      -.8      -.2       .1        .4       .1
Co-worker         .3      -.1       .0       .5      -.6        .1      -.2
Other*           1.1      -.4      -.8      2.2      2.8       -.7       .8

* The categories grandfather, grandmother, grandson, granddaughter, other kin from my family and other kin from my partner's family from the original question wording were collapsed into the category "other kin" in the analyses. The category "other" represents all other types of relations besides those specifically listed and was used as such in the original question wording and in the analyses later on.

Within the whole network, partner had significantly less importance (lower percentage) in the no limit condition, whereas son, other kin, friend and other had significantly more (higher percentage) importance.

9 Multiple Classification Analysis; see a more detailed explanation in Section 3.3.

Within instrumental support in the no limit condition, friends had significantly greater importance (help in the household, illness), while brother had significantly less importance (borrow a larger sum of money).

Statistically significant differences appear most commonly within emotional support. In the case of problems with a partner, mother and other had greater importance in the no limit condition, sister and other kin had greater importance in the limit condition. In the case of depression, other kin again had greater importance in the limit condition, whereas partner and mother had greater importance in the no limit condition.

For informational support, there were no significant differences between conditions.
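The comparison behind a single cell of Table 2 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure actually used: the paper does not say which variant of the independent samples t-test was run, so Welch's unequal-variance form is assumed here; the p-value is approximated with the standard normal distribution, which is reasonable only for samples of about this size; and the two samples are randomly generated, purely hypothetical percentages. The names `welch_t`, `no_limit` and `with_limit` are illustrative.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples.

    Returns (mean difference, t, Welch-Satterthwaite df, approximate p).
    The p-value uses a normal approximation (an assumption of this sketch),
    so it is only trustworthy for large samples.
    """
    na, nb = len(sample_a), len(sample_b)
    va, vb = variance(sample_a), variance(sample_b)  # sample variances (n - 1)
    se2 = va / na + vb / nb                          # squared standard error
    diff = mean(sample_a) - mean(sample_b)
    t = diff / sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    p_approx = 2 * (1 - NormalDist().cdf(abs(t)))    # two-sided
    return diff, t, df, p_approx


# Hypothetical data: % of friends in each respondent's network.
# Sample sizes follow Table 1; the values themselves are made up.
rng = random.Random(42)
no_limit = [min(100.0, max(0.0, rng.gauss(30.0, 20.0))) for _ in range(232)]
with_limit = [min(100.0, max(0.0, rng.gauss(26.0, 22.0))) for _ in range(331)]

diff, t, df, p = welch_t(no_limit, with_limit)
print(f"mean difference (no limit - limit): {diff:.2f}")
print(f"t = {t:.2f}, df = {df:.1f}, approx. p = {p:.4f}")
```

Repeating this for every relation type and every name generator (plus the overall network) gives the 84 tests reported above; the sign convention matches Table 2, where a positive difference means a higher percentage in the no limit condition.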
3.2 Multiple regression analysis

The next step in our analysis was to determine whether differences in means really depend to a great extent on using limitation in name generators or if the effects of other factors, such as demographic variables (e.g., gender, age, education or marital status), are greater. Therefore, OLS regression analyses were done with network composition indicators as dependent variables; the independent variables were the presence/absence of limitation of the number of alters and, as controls, the demographic variables (gender, age, education and marital status). As with the t-tests, analyses were done for the overall network and for each type of support separately. Altogether 84 regression analyses were performed.

Table 3: Strength of effects (statistically sig. effect of the limitation in grey).

             All ntw.    Hshold.     Illness     Money       Partner     Depress.    Advice
Partner      Some ctrls  All ctrls   All ctrls   Some ctrls  Some ctrls  Some ctrls  Some ctrls
Mother       All ctrls   All ctrls   All ctrls   Some ctrls  Some ctrls  Some ctrls  Some ctrls
Father       All ctrls   Some ctrls  All ctrls   Some ctrls  Some ctrls  Some ctrls  All ctrls
Daughter     All ctrls   Some ctrls  Some ctrls  All ctrls   All ctrls   Some ctrls  Some ctrls
Son          Some ctrls  All ctrls   Some ctrls  Some ctrls  Some ctrls  Some ctrls  Some ctrls
Sister       Some ctrls  Some ctrls  Some ctrls  Some ctrls  Some ctrls  Some ctrls  Some ctrls
Brother      All ctrls   Some ctrls  Some ctrls  Some ctrls  All ctrls   Some ctrls  All ctrls
Other kin*   Some ctrls  All ctrls   All ctrls   All ctrls   Some ctrls  Some ctrls  Some ctrls
Friend       Some ctrls  Some ctrls  Some ctrls  All ctrls   Some ctrls  All ctrls   Some ctrls
Neighbor     Some ctrls  Some ctrls  Some ctrls  Some ctrls  Some ctrls  Some ctrls  All ctrls
Co-worker    Some ctrls  Some ctrls  All ctrls   Some ctrls  All ctrls   All ctrls   Some ctrls
Other*       Limit       Some ctrls  Some ctrls  Some ctrls  Limit       Some ctrls  Some ctrls

* The categories grandfather, grandmother, grandson, granddaughter, other kin from my family and other kin from my partner's family from the original question wording were collapsed into the category "other kin" in the analyses. The category "other" represents all other types of relations besides those specifically listed and was used as such in the original question wording and in the analyses later on.

The results are summarized in Table 3. Three typical situations appear in the table. "All ctrls" means that all control variables had stronger effects10 than the limit/no limit condition; "Some ctrls" means that some control variables had a stronger effect than the limit/no limit condition. Where the effect of the limit/no limit condition was the strongest, "Limit" appears in the table.

If Table 3 is compared to Table 2, it can immediately be seen that statistically significant effects (shown in grey) appear practically at the same positions in the tables. Where the effect of the limitation is statistically significant, the standardized beta coefficients vary between -.063 and -.150 on the negative side and between .069 and .175 on the positive side. In five of these cases the betas are significant at the 1% level (dark grey), and in eleven cases they are significant at the 5% level (light grey).11 In most cases, neither the effects of control variables nor the effect of the limit/no limit condition is statistically significant. It can also be seen that the effect of the limit/no limit condition is stronger than the effect of all control variables in only two out of 84 analyses.

It may be somewhat counterintuitive that the category "other" should emerge as statistically significant. However, this result may be precisely an indicator of the effect of using a limitation in the question wording. Thus, a larger percentage of "other" is obtained under the no limit condition than under the limit condition.
If a respondent is allowed to freely name as many persons as he/she wants, a weak tie (e.g., a psychotherapist) is more likely to appear on the list than when he/she is limited to the two most important persons, where strong ties are more likely to be named (e.g., Burt, 1986; see also the Discussion section of this paper).

Up to this point the analyses were done on the individual level, i.e., the units of analysis were the actual respondents in the survey. In the next section we proceed with analyses on a higher, aggregated level, in the sense that the units of analysis are no longer individual respondents, but network composition indicators. We are therefore performing a kind of meta-analysis on the variables, with the aim of testing the effects of some factors that we could not test on the level of individual respondents.

10 Comparison of the strength of effects was done on the basis of comparison of the absolute value of standardized beta coefficients in each model.
11 Since we are dealing with multiple comparisons, global risk was also considered. If 5% global risk is taken as an acceptable limit, individual risk can be calculated, for instance, with the Bonferroni correction (Bonferroni, 1936) - global risk divided by the number of tests: 5%/84 ≈ 0.06%. Therefore, the effect of the limit can be considered statistically significant in cases where the significance level does not exceed 0.001, which means in 5 out of 84 cases.

3.3 Multiple classification analysis

The third part of our analysis was done by Multiple Classification Analysis (Andrews et al., 1973). It is a multivariate method in which relationships between multiple independent variables (predictors) and a dependent variable are analyzed. It is similar to multiple regression, with the advantage that nominal measurement level variables do not need to be dichotomized.
MCA gives us the following information:
- the grand (total) mean and group means of the dependent variable for each combination of categories of predictors;
- tests of significance of the effects of single predictors;
- βs: the effect of each predictor (with other predictors held constant);
- deviations from the grand mean of the dependent variable for each category of a predictor and
- R2: the percentage of explained variance for all predictors.

First, the means of network composition indicators (% of partner, friends etc.) were estimated separately for each support type and for each of the two conditions (with/without limit) - this was our dependent variable. In the MCA analysis it was then tested with which predictors and to what extent these differences could be explained. Explanatory variables in the analysis were as follows:
- limitation of the number of alters (none or first two),
- type of social support (instrumental, emotional or informational),
- strength of tie (strong (partner, friend or close kin), weak (other kin, neighbor, co-worker or other)).12

There are several different ways in which the strength of a tie can be defined and assessed. For instance, it can be defined as multiplexity - a tie is strong if it contains many different kinds of interactions (e.g., exchanges of different types of social support) between the respondent and an alter (e.g., Wellman and Wortley, 1990). Strong ties are usually named first, since they are more salient and therefore more quickly retrievable from memory (e.g., Burt, 1986; Brewer, 1993, 1995; Brewer and Yang, 1994). The closest ties can also be defined by the type of relationship between the respondent and the alter (e.g., Wellman et al., 1988/1997; Wellman and Wortley, 1990). The closest network usually consists of the partner, close kin (parents, children and siblings) and friends, whereas the extended network contains extended kin, coworkers, neighbors and so on.
The latter definition of a strong tie is the one used in this paper.

12 Mother, father, son, daughter, brother and sister were collapsed into the category "close kin". Grandmother, grandfather, granddaughter, grandson, other kin from my family and other kin from my partner's family were collapsed into the category "other kin".

The results are presented in Table 4. It can be seen that only strength of tie has a statistically significant effect on the mean of network composition indicators. As already shown in previous studies (e.g., Hlebec and Kogovšek, 2005, 2009; Kogovšek and Hlebec, 2005), means are higher for strong ties and lower for weak ties. Differences attributable to the limitation and to support type are small. Altogether, 17.5% of the variance is explained by all predictors in the model.

Table 4: Multiple classification analysis.

Network composition (grand mean = 8.10)

                       N    multivariate deviation   sig. level    β
LIMIT                                                              .026
  Without              72            .23
  With                 72           -.23
STRENGTH OF TIE                                      ***           .418
  Strong               96           2.59
  Weak                 48          -5.19
SUPPORT TYPE                                                       .006
  Instrumental         72            .03
  Emotional            48           -.07
  Informational        24            .05
Multiple R2          .175

* . 10
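The quantities reported in Table 4 (grand mean, category deviations, β for each predictor and multiple R2) can be illustrated with a small sketch. It rests on several assumptions that are not taken from the paper: the additive MCA model is fitted here by a simple backfitting-style iteration rather than by any particular statistical package, β is computed as the square root of the ratio between the sum of squares based on the adjusted deviations and the total sum of squares, and the input values are randomly generated, hypothetical percentages. Only the layout of the data (2 conditions x 6 name generators x 12 relation types = 144 units) mirrors the N column of Table 4. The function name `mca` and all variable names are illustrative.

```python
import random
from collections import defaultdict
from math import sqrt


def mca(y, factors, sweeps=100, tol=1e-10):
    """Additive MCA model: y = grand mean + sum of category deviations + error.

    y       : list of values of the dependent variable
    factors : dict mapping predictor name -> list of category labels
    Returns (grand_mean, deviations, betas, r2).
    """
    n = len(y)
    grand = sum(y) / n
    total_ss = sum((v - grand) ** 2 for v in y)
    names = list(factors)
    dev = {name: dict.fromkeys(factors[name], 0.0) for name in names}
    counts = {name: {c: list(factors[name]).count(c) for c in dev[name]}
              for name in names}

    for _ in range(sweeps):
        biggest_change = 0.0
        for name in names:
            # Average what is left after removing the grand mean and the
            # current deviations of all other predictors.
            sums = defaultdict(float)
            for i in range(n):
                others = sum(dev[o][factors[o][i]] for o in names if o != name)
                sums[factors[name][i]] += y[i] - grand - others
            for cat, total in sums.items():
                new = total / counts[name][cat]
                biggest_change = max(biggest_change, abs(new - dev[name][cat]))
                dev[name][cat] = new
        if biggest_change < tol:
            break

    fitted = [grand + sum(dev[o][factors[o][i]] for o in names) for i in range(n)]
    resid_ss = sum((a - b) ** 2 for a, b in zip(y, fitted))
    betas = {name: sqrt(sum(counts[name][c] * d ** 2
                            for c, d in dev[name].items()) / total_ss)
             for name in names}
    return grand, dev, betas, 1.0 - resid_ss / total_ss


# Hypothetical aggregated data: one made-up mean percentage per
# condition x name generator x relation type (144 units in total).
rng = random.Random(0)
strong_relations = ["partner", "mother", "father", "daughter",
                    "son", "sister", "brother", "friend"]
weak_relations = ["other kin", "neighbor", "co-worker", "other"]
generators = ["instrumental"] * 3 + ["emotional"] * 2 + ["informational"]

y = []
factors = {"limit": [], "strength": [], "support": []}
for limit in ("without", "with"):
    for support in generators:
        for relation in strong_relations + weak_relations:
            strength = "strong" if relation in strong_relations else "weak"
            base = 10.5 if strength == "strong" else 3.0  # invented values
            y.append(base + rng.gauss(0.0, 1.5))
            factors["limit"].append(limit)
            factors["strength"].append(strength)
            factors["support"].append(support)

grand_mean, deviations, betas, r2 = mca(y, factors)
print(f"grand mean = {grand_mean:.2f}, R2 = {r2:.3f}")
for name in factors:
    print(name, {c: round(d, 2) for c, d in deviations[name].items()},
          f"beta = {betas[name]:.3f}")
```

Within each predictor the deviations, weighted by the category Ns, sum to zero, which is also the pattern visible in Table 4 (for example, 96 x 2.59 and 48 x -5.19 cancel out up to rounding).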