Supplemental Information

For the data characterizing traditional gender institutions, this document (1) describes in detail each variable in the final dataset and (2) provides supplemental information on data collection and processing methodology.

Description of Variables in Dataset

Primary Variables

  • Intermediate or Local Political Leadership The primary focus of this study is to assess the level of access that contemporary women have to political leadership. Drawn from the Standard Cross Cultural Survey, this variable characterizes the gender distribution amongst local or intermediate political leaders. We expansively define a local or intermediate political leader as an individual who has authority or significant influence over members of the same ethnic group beyond a single familial lineage. Such leaders may include village paramount chiefs, monarchs, village headmen and women, or other individuals with formalized influence such as “Queen mothers”.
  • Solidarity and Socialization Groups We are also interested in the presence of social groups that serve to empower or instruct members within each society. We expand upon the definition of a female solidarity group provided by Sanday (1981), creating a symmetric definition for men: “Female [male] solidarity groups are present that: (1) are unofficial groups formed by females [males] to help one another and to discuss matters of extra-domestic importance or (2) are official groups recognized in society whose function is to consolidate female [male] economics, social, or political power.” To this definition, we also add any groups that exist to provide socialization or cultural education for adolescent girls and boys. This definition therefore includes gender-restricted groups such as secret societies, clubs, and formalized mutual aid agreements. It is important to note that, by this definition, a group must be durable and active in a member’s life to be considered a solidarity or socialization group. Nominal associations such as simple age-sets or transient initiation rituals that impart no mutual benefit, empowerment, or cultural instruction to their members are not considered to be solidarity or socialization groups in this dataset. We code the presence or absence1 of such groups.
  • Kinship or Familial Leadership Distinct from intermediate political leaders, we encode the traditional gender distribution for leaders of familial lineages, clans, and other kinship groups. We define a familial or kinship leader as an individual who holds some form of principal authority or control over members of their familial lineage beyond the simple nuclear household2. For example, the Serer people designate an elder male of a particular matrilineage as Tookor, a prominent member of the family with control over familial assets and marriage arrangements for nephews. The Serer also customarily regard the mother or sister of the king as Lingeer, a “queen-mother” with significant social and economic influence. Given that the Lingeer is a matrilineally inherited position, we would consider Serer familial leadership positions to be predominantly male with female participation to a lesser extent.

Secondary Variables

  • Division of Labor in Agricultural Production The data include a series of variables that seek to characterize the gender distribution within various agricultural tasks. Specifically, we encode participation by gender in land clearance, soil preparation, planting, crop tending, harvesting, the care of domestic animals (small and large), and milking. We define participation in each task based on the average share of labor and effort each gender has historically contributed in pre-colonial times.
  • Names of Crops Grown We record crops grown by each ethnic group as a whole. Additionally, many societies feature crops that are customarily the domain of one gender. Where this information is available, we decompose this set of crops into those that are said to be “grown by” women and those “grown by” men. Note that if a society features only women in agricultural production but the sources do not explicitly mention that these crops are grown by women, we only list crops grown by the group as a whole3.
  • Deities and Mythical Founders of the Culture We are interested in gender representation within the ethnic group’s worldview. Specifically, we encode the representation of male and female figures in (1) cultural origin or founding legendarium4 and (2) any recognized deities for the ethnic group. This information was coded on the basis of literary mentions of gender in reference to these two subjects. For example, if the only gods mentioned in literary sources for the group are male, we encode the deities variable as 1, indicating that males figure exclusively. The same logic is used for the variable on mythical founders.
  • Collective Fighting and Warfare We encode the gender representation for any combat activities undertaken by the ethnic group. We define collective fighting and warfare as organized military or defensive operations.
  • Control over Fruits of Labor We encode the control over the distribution of any resources produced by household labor on the basis of gender. We also separately encode control over fruits of labor produced solely by females and males. Note that this question is distinct from control over non-moveable assets or property.
  • Community Gatherings and Small Councils We encode participation in gatherings such as community meetings or small councils by gender. For the purposes of this study, we are interested in any type of community gathering in which members come together and discuss or make decisions over tribal affairs.

Data Processing and Finalization

Upon completion of these stages, raw data from the literature-based and survey-based collection phases were reviewed systematically. The purpose of the final data review is to (1) correct any simple coding errors or anomalies, (2) verify consistency in the coding process, and (3) reconcile data collected from the literature and survey-based sources.

Review of Literature-based Data

The systematic corrections applied to all literature-based ethnographic codes are as follows:

  • When agricultural cultivation or livestock are clearly present but no explicit mentions of the actual gender divisions are present, we set the relevant agricultural labor division code to 0, which indicates that the activity is clearly present but the gender division is not discussed in any available sources.
  • There is a subset of cases in which our research assistants found that although a characteristic was discussed in the literature, the division by gender was not discussed. The RAs were instructed to flag these cases as ambiguous. As part of the coding review, all such codes where the ambiguity comes solely from the absence of a clear gender distinction were recoded to 0, which indicates that the activity is clearly present but the gender division is not discussed in any available sources.
  • We attempted to resolve any remaining variables coded as ambiguous based on the citation used by the research assistant. If the variable was coded as ambiguous but the citation clearly indicated an existing code, the variable was corrected to reflect an appropriate existing code. Ambiguities arising from conflicting literary sources or other discrepancies were left coded as ambiguous.
  • There is a subset of cases in which the society customarily has women and men participating in agricultural labor in differentiated but roughly equal ways. For example, there are many groups in which women and men farm different crops. We addressed this phenomenon by using a set of simple recoding rules:
    • If the group features women and men farming their own crops, recode the variable to reflect who contributes the larger share of labor to aggregate agricultural production.
    • If the group features women and men farming their own crops and those crop varieties are explicitly mentioned in the literary source, we note the varieties grown by men, varieties grown by women, and varieties grown by the group as a whole in separate variables.
    • If men and women are both said to participate but the actual labor division is not explicitly mentioned, we recode each pertinent agricultural labor task to 3, which indicates equal or near equal participation by gender.
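
As a rough illustration, the zero-coding and equal-participation rules above can be sketched in Python. This is a minimal sketch rather than the procedure actually used: the function name, its arguments, and the "ambiguous" sentinel are hypothetical, and only the codes 0 and 3 come from the text. The rule for groups in which women and men farm their own crops requires a judgment about labor shares and is not modeled here.

```python
# Codes taken from the text.
PRESENT_NO_GENDER_INFO = 0  # activity present, gender division not discussed
EQUAL_PARTICIPATION = 3     # equal or near equal participation by gender

# Hypothetical sentinel for a code flagged as ambiguous by a research assistant.
AMBIGUOUS = "ambiguous"


def review_labor_code(raw_code, activity_present, division_discussed,
                      both_participate=False):
    """Apply the review rules to one agricultural labor code.

    raw_code: the code entered by the research assistant (None if missing).
    activity_present: True if cultivation or livestock is clearly present.
    division_discussed: True if the sources discuss the gender division.
    both_participate: True if men and women are both said to participate.
    """
    # Nothing to recode if the activity is absent or the division is discussed.
    # Ambiguities from conflicting sources therefore stay ambiguous.
    if not activity_present or division_discussed:
        return raw_code
    # Both genders participate, but the actual division is not mentioned.
    if both_participate:
        return EQUAL_PARTICIPATION
    # Missing or flagged only because no gender distinction is given.
    if raw_code is None or raw_code == AMBIGUOUS:
        return PRESENT_NO_GENDER_INFO
    return raw_code
```

For instance, a task flagged as ambiguous only because no source discusses the gender division would be passed as review_labor_code("ambiguous", True, False) and recoded to 0.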

Merging Overlapping Codes

Finally, since our survey-based subsample of groups included a random subsample of groups already covered by the literature-based phase, 27 groups had overlapping ethnographic codes from both the literature-based encoding and the survey responses. We develop a process to merge overlapping survey responses with the literature-based codes. It is important to note that this process strives to keep the dataset consistent with itself. Our reconciliation procedure is as follows:

General

  • If a variable was missing from the literature-based coding, we used the survey-based code, and vice versa.
  • If the literature-based code indicated that the activity or characteristic was present in society but no clear indication of the gender division was mentioned, we chose the most frequently occurring code from the survey responses.
  • When the codes conflicted, we used the note provided by the literature-based team to reconcile the difference. Evidence from the primary literature, given in the note used by the research assistant to justify their code, was used to generate the final code.
    • In cases in which the note associated with the literature-based code indicates that the value was coded, we preferred the most frequently occurring survey response.
  • If the code conflict could not be resolved through the literature-based note, we tried one last search through any available web and print sources to see if any accessible information could resolve the conflict.
  • Any remaining conflicts that could not be resolved through the previous rules were coded as ambiguous in the final dataset.
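
The general rules above can be summarized in a short Python sketch. It assumes that missing codes are stored as None, that the code 0 (present, gender division not discussed) is used for every variable, and that a tie among survey responses counts as having no most frequent value. These representations and the names most_frequent, merge_code, note_code, and search_code are hypothetical. The note_code and search_code arguments stand in for the manual judgments drawn from the coding note and from the final search of web and print sources.

```python
from collections import Counter

AMBIGUOUS = "ambiguous"     # hypothetical sentinel for unresolved conflicts
PRESENT_NO_GENDER_INFO = 0  # present, gender division not discussed


def most_frequent(values):
    """Return the single most frequent non-missing value, else None.

    Returning None on a tie is an assumption of this sketch.
    """
    counts = Counter(v for v in values if v is not None)
    ranked = counts.most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


def merge_code(lit_code, survey_code, note_code=None, search_code=None):
    """Merge one literature-based code with the modal survey code."""
    # A missing code is filled from the other source.
    if lit_code is None:
        return survey_code
    if survey_code is None:
        return lit_code
    # Literature says "present, division not discussed": prefer the survey.
    if lit_code == PRESENT_NO_GENDER_INFO:
        return survey_code
    # The two sources agree.
    if lit_code == survey_code:
        return lit_code
    # Conflict: use evidence from the coding note, then from a final search.
    if note_code is not None:
        return note_code
    if search_code is not None:
        return search_code
    # Nothing resolved the conflict.
    return AMBIGUOUS
```

A call such as merge_code(1, most_frequent([2, 2, 3])) would return "ambiguous" unless a note_code or search_code is supplied to settle the conflict.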

Names of crops grown

  • We take the union of all crops listed for the “Crops grown” variables.
  • For crop varieties grown by men and women, there were several cases in which the responses provided have conflicting or incorrectly answered codes5. For these cases, we drop all the conflicting crops and keep any varieties that are unique to the respective gender.
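
The two crop rules amount to simple set operations, sketched below in Python. The function name and the input layout (one list of crop names per source) are hypothetical, the sketch treats a crop as conflicting when it is listed for both men and women, and it assumes that only the group-level lists feed the “Crops grown” union.

```python
def merge_crops(group_lists, men_lists, women_lists):
    """Merge crop lists from several sources for one ethnic group.

    Each argument is a list of lists of crop names, one inner list per source.
    """
    # Union of all crops listed for the group as a whole.
    crops_all = set().union(*group_lists)
    men = set().union(*men_lists)
    women = set().union(*women_lists)
    # A crop listed for both genders is treated as conflicting and dropped.
    conflicting = men & women
    return {
        "all": sorted(crops_all),
        "men": sorted(men - conflicting),
        "women": sorted(women - conflicting),
    }
```

With purely illustrative inputs, merge_crops([["maize", "yams"], ["maize", "millet"]], [["yams", "millet"]], [["millet", "groundnuts"]]) keeps yams for men and groundnuts for women and drops millet from both gendered lists.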

Special Cases

  • As there are no literature-based codes for the Ndau, Pedi, Sukuma, Tonga, and Tumbuka groups, we take the most frequently occurring survey value as the final merged code for each variable. We note that the responses for political and kinship leadership as well as solidarity groups are the most harmonized, while there was relatively more variation in the other variables.

We use this procedure for the groups listed in the table below. The numbers of literature-based and survey-based codes used to generate a final “merged” code are noted in the Literature and Survey columns, respectively. Any additional considerations are noted in the Note column.

Overlapping Codes from Traditional Data Collection.
Group | Literature | Survey | Note
Akyem | 1 | 3 | We use the response given by an academic who identified themself as a member of the ethnic group.
Anuak | 1 | 1 |
Ashanti | 1 | 1 |
Birom | 1 | 1 |
Chewa | 1 | 1 |
Duala | 1 | 1 |
Egba | 1 | 1 |
Fanti | 1 | 3 |
Ganda | 1 | 7 |
Gusii | 1 | 2 |
Igbira | 1 | 1 |
Jie | 1 | 2 |
Kamba | 1 | 1 |
Konjo | 1 | 1 | Given the sparsity of data in literature-based coding of the Konjo, we elected to defer to the survey response in all cases of conflict with literature-based codes.
Lango | 1 | 1 |
Lozi | 1 | 1 |
Luba | 1 | 4 | As there was considerable variation in the agricultural division of labor survey responses, we retained the original literature-based code for variables in which no value occurred more frequently than the rest.
Luo | 1 | 3 |
Masai | 1 | 2 | From our reading of the literature and the survey responses, it seems that the practice of cultivation varies across subgroups. We left the agricultural division of labor codes ambiguous but kept the data for crop varieties grown.
Ndau | 0 | 11 |
Ngombe | 1 | 3 |
Pedi | 0 | 8 |
Sukuma | 0 | 2 |
Thonga | 1 | 1 |
Tonga | 0 | 9 |
Toro | 1 | 1 |
Tumbuka | 0 | 2 |
Yanzi | 1 | 1 |

Ultimately, this process yields a set of ethnographic codes for 317 distinct ethnic groups.

Merging collected data to spatial datasets

To visualize the spatial distribution of the traditional ethnographic data, we map the encoded data to Murdock (1959)’s Ethnolinguistic Map (Footnote: The original map published in Murdock 1959 was digitized by Nathan Nunn and is available at his website.). To link the groups listed in Murdock (1967) to the spatial regions named in Murdock (1959)’s map, we use a correspondence between the two datasets compiled by Fenske (2014)6. One problem that arises from this approach is that there exists a subset of cases in which multiple groups in Murdock (1967) correspond to a single spatial region in Murdock (1959). We address this by providing two distinct datasets:

  1. A “complete” dataset of distinct Sub-Saharan groups drawn directly from Murdock (1967) (n = 317)
  2. A “collapsed” dataset of groups with a one-to-one correspondence between the collected data and the Ethnolinguistic Map (n = 295). We identify all cases where the mapping is not one-to-one and manually merge the conflicting codes using the following rules:

    • Non-missing codes take precedence over ambiguous codes which take precedence over missing codes.
    • Cases in which non-missing codes conflict are resolved by using the coding note in the literature-based data.
    • Cases in which the non-missing codes cannot be merged using the coding note are coded to ambiguous.
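
The precedence rules for the “collapsed” dataset can be sketched in Python as follows. This is only an illustration under assumed representations: missing codes are stored as None, ambiguous codes as the string "ambiguous", and note_code is a hypothetical argument standing in for the manual reading of the coding note.

```python
AMBIGUOUS = "ambiguous"  # hypothetical sentinel for ambiguous codes


def collapse_codes(codes, note_code=None):
    """Collapse the codes of several groups mapped to one spatial region.

    codes: one code per group (None if missing, AMBIGUOUS if ambiguous).
    note_code: the code supported by the coding note, if the note settles
        a conflict between non-missing codes.
    """
    codes = list(codes)
    # Codes that are neither missing nor ambiguous.
    substantive = {c for c in codes if c is not None and c != AMBIGUOUS}
    # Exactly one distinct non-missing code: it takes precedence.
    if len(substantive) == 1:
        return next(iter(substantive))
    # Conflicting non-missing codes: use the coding note, else ambiguous.
    if len(substantive) > 1:
        return note_code if note_code is not None else AMBIGUOUS
    # No non-missing code: ambiguous takes precedence over missing.
    if AMBIGUOUS in codes:
        return AMBIGUOUS
    return None
```

For example, collapse_codes([None, "ambiguous", 2]) returns 2, while collapse_codes([1, 2]) returns "ambiguous" unless a note_code is given.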

There are groups merged in the “collapsed” data for which the above rules do not apply. For the following groups, merging for the purposes of mapping is complicated by the fact that Fenske (2014) matches the groups based on location. At times, this results in matches between groups that live in close proximity to one another but are clearly very different from one another.

  1. The Lese and the Mbuti both correspond to the Lese region in Murdock (1959)7. In reality these are distinct groups that live in close proximity to one another and by most accounts form a symbiotic relationship in terms of trade. The primary difference is that the Lese practice sedentary agricultural cultivation while the Mbuti do not.
  2. The Nandi and Tiriki people both correspond to the Nandi region in Murdock (1959)8. These groups live in close proximity to one another but have very different customs, with the Nandi demonstrating more political development in the pre-colonial period than the Tiriki.
  3. The “Hill Suk” and “Plains Suk” are subgroups of the Pokot (Suk) people divided by their primary subsistence activity. The “Hill Suk” practice agriculture while the “Plains Suk” are pastoralists.

The “collapsed” data can be directly mapped to the spatial data in Murdock (1959) while the “complete” data includes all collected data. We include a set of dummies in the “complete” data that indicate which groups have a one-to-one mapping with Murdock (1959) before merging, which groups have a one-to-one mapping with Murdock (1959) after merging, and cases in which there was substantial disagreement between the encodings of each group. To provide an alternative solution for those wishing to use the spatial data for all the distinct groups in the complete dataset without manually merging the conflicts, we also include the geographic coordinates of each group’s centroid, which are provided by Murdock (1967).
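
As an illustration of the first dummy described above, a one-to-one indicator can be derived from a group-to-region correspondence. The Python sketch below is hypothetical: the function name and dictionary layout are assumptions, and it does not reproduce the format of the correspondence compiled by Fenske (2014).

```python
from collections import Counter


def one_to_one_flags(group_to_region):
    """Flag groups whose spatial region is matched by no other group.

    group_to_region: dict mapping each group name to its region name.
    Returns a dict mapping each group to 1 (one-to-one) or 0 (shared region).
    """
    # Count how many groups are matched to each region.
    region_counts = Counter(group_to_region.values())
    return {
        group: int(region_counts[region] == 1)
        for group, region in group_to_region.items()
    }
```

Applied to the cases discussed above, the Lese and Mbuti (both matched to the Lese region) and the Nandi and Tiriki (both matched to the Nandi region) would each receive a 0, while a group that is the only match for its region would receive a 1.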

Important Considerations and Limitations of the Data

The present study attempts to encode information regarding gender dynamics for pre-colonial indigenous societies in Africa from qualitative information recorded in a broad range of ethnographic literature. Use of the final dataset requires that the researcher acknowledge several limitations of the data collection process.

First, our objective is to characterize each ethnic group in its state prior to the onset of European colonial influence. For the literature-based phase, a challenge to this objective is simply the availability of pre-colonial ethnographic information across groups. Some ethnic groups are well-documented by anthropologists while others have only information written about the group in the post-colonial period. We address this issue by relying on the earliest available data wherever possible and flagging any data that seems to be confounded with colonial influence.

The second limitation lies in the rate at which authors of these ethnographic studies explicitly mention gender characteristics for each society. For sources in which there were no direct mentions of any divisions by gender, the undergraduate RAs were instructed to code each variable on the basis of gender pronouns or other terms that signify gender9. It is worth noting that should the use of gender pronouns by earlier authors deviate from modern sociological considerations of gender, the data in the literary surveys may generate a qualitative bias towards only or predominantly male rulers.

References

Fenske, James. 2014. “Ecology, Trade, and States in Pre-Colonial Africa.” Journal of the European Economic Association 12 (3): 612–40.

Murdock, George Peter. 1959. Africa: Its Peoples and Their Culture History. McGraw-Hill.

Murdock, George Peter. 1967. “Ethnographic Atlas: A Summary.” Ethnology 6 (2): 109–236.

Sanday, Peggy Reeves. 1981. Female Power and Male Dominance: On the Origins of Sexual Inequality. Cambridge University Press.


  1. It must be explicitly mentioned that solidarity groups are non-existent within the society.

  2. At most, a set of spouses and their immediate offspring.

  3. In other words, we add the list of varieties grown to the variable capturing crops grown by the group as a whole. We do not duplicate this value for the variable capturing crops grown by women.

  4. Including mythical stories, historical stories, and any mix of the two.

  5. For example, listing a single crop as a variety grown by both men and women.

  6. Fenske’s correspondence seems to be the only publicly available source for matching the two data sources. It should be noted that this correspondence contains some inconsistencies. For example, Fenske contends the Lese are an alternative name for the Mbuti. In reality, these are completely distinct groups that traditionally live in close proximity and cooperatively trade labor and goods.

  7. A match made by Fenske (2014) according to location.

  8. A match made by Fenske (2014) according to location.

  9. References only to headmen or chiefs may obscure the actual presence of headwomen or chieftainesses if the author does not explicitly state that females have the potential to serve in leadership roles.