Title: | Identify Similar Cases for Qualitative Case Studies |
---|---|
Description: | Allows users to identify similar cases for qualitative case studies using statistical matching methods. |
Authors: | Rich Nielsen |
Maintainer: | Rich Nielsen <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.1.0 |
Built: | 2025-03-01 04:20:14 UTC |
Source: | https://github.com/cran/caseMatch |
This package uses statistical matching to identify "most similar" cases in a quantitative data set for subsequent qualitative analysis. Unlike existing matching packages, this package is intended to meet some specific needs of analysts using matching for case studies.
Use the case.match function.
Maintainer: Rich Nielsen <[email protected]>
Nielsen, Richard. 2016. "Case Selection via Matching," Sociological Methods and Research, 45 (3): 569-597. http://journals.sagepub.com/doi/abs/10.1177/0049124114547054
library(caseMatch)
data(EU)
mvars <- c("socialist","rgdpc","FHc","FHp","trade")
dropvars <- c("countryname","population")
## In this example, I subset to the first 40 obs. to cut run-time
out <- case.match(data=EU[1:40,], id.var="countryname", leaveout.vars=dropvars,
                  distance="mahalanobis", case.N=2, greedy.match="pareto",
                  number.of.matches.to.return=10, treatment.var="eu",
                  max.variance=TRUE)
out$cases
Uses matching methods to select cases for qualitative analysis
case.match(data, id.var, case.N = 2, distance = "mahalanobis",
           design.type = "most similar", match.case = NULL,
           greedy.match = "pareto", number.of.matches.to.return = 1,
           treatment.var = NULL, outcome.var = NULL, leaveout.vars = NULL,
           max.variance = FALSE, max.variance.outcome = FALSE,
           variance.tolerance = 0.1, max.spread = FALSE,
           max.spread.outcome = FALSE, varweights = NULL)
data |
A data frame. |
id.var |
A string variable that uniquely identifies cases within the data |
case.N |
The number of cases to choose. Must be 1 or more. |
distance |
The distance metric, specified as a string. Options are "mahalanobis", "euclidean", or "standardized", where "standardized" means that variables are standardized by their standard deviations. |
design.type |
Should the algorithm pick cases that are most similar or most different? Specify either "most similar" or "most different" as a string. |
match.case |
If specified, this is the value of |
number.of.matches.to.return |
How many matches to return. |
greedy.match |
Specifies which matches to return. Options are "pareto", "greedy", and "all". "all" keeps all matches. "pareto" eliminates 'redundant' matches where both units have better available matches. "greedy" keeps only the top matches in the data, but it eliminates the best matches for some units because it uses a without-replacement algorithm. |
treatment.var |
The name of the treatment variable, specified as a string. |
outcome.var |
The name of the outcome variable, specified as a string. |
leaveout.vars |
A vector of variables to not include in the matching. |
max.variance |
Should the cases be selected to maximize variance on |
max.variance.outcome |
Should the cases be selected to maximize variance on |
variance.tolerance |
The proportion of cases to consider if |
max.spread |
Should the cases be selected to maximize "spread" on the treatment variable, meaning that cases are selected to have evenly spaced values from the min of |
max.spread.outcome |
Should the cases be selected to maximize "spread" on the outcome variable, meaning that cases are selected to have evenly spaced values from the min of |
varweights |
An optional vector of variable weights. It must line up with the columns of the data after |
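The three greedy.match options can be illustrated on a toy distance matrix. The following is a minimal base R sketch, not the package's implementation: the distance values are made up, the helper names greedy_pairs and pareto_pairs are hypothetical, and the "pareto" rule is one reading of the description above (a pair is dropped only when both of its units have a strictly closer match elsewhere).

```r
# Toy symmetric distance matrix for four hypothetical cases (made-up values).
d <- matrix(c(0, 1, 4, 6,
              1, 0, 2, 5,
              4, 2, 0, 3,
              6, 5, 3, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(LETTERS[1:4], LETTERS[1:4]))

# "all": every unordered pair, sorted from closest to farthest.
idx <- which(upper.tri(d), arr.ind = TRUE)
pairs <- data.frame(unit1 = rownames(d)[idx[, "row"]],
                    unit2 = colnames(d)[idx[, "col"]],
                    distance = d[idx],
                    stringsAsFactors = FALSE)
pairs <- pairs[order(pairs$distance), ]
rownames(pairs) <- NULL

# "greedy" (assumed reading): walk down the sorted list and keep a pair
# only if neither unit has been used yet (matching without replacement).
greedy_pairs <- function(pairs) {
  used <- character(0)
  keep <- logical(nrow(pairs))
  for (i in seq_len(nrow(pairs))) {
    units <- c(pairs$unit1[i], pairs$unit2[i])
    if (!any(units %in% used)) {
      keep[i] <- TRUE
      used <- c(used, units)
    }
  }
  pairs[keep, ]
}

# "pareto" (assumed reading): drop a pair only if BOTH units have a
# strictly closer match available elsewhere.
pareto_pairs <- function(pairs, d) {
  diag(d) <- Inf
  best <- apply(d, 1, min)  # each unit's smallest distance to any other unit
  redundant <- pairs$distance > best[pairs$unit1] &
               pairs$distance > best[pairs$unit2]
  pairs[!redundant, ]
}

all_matches    <- pairs
greedy_matches <- greedy_pairs(pairs)
pareto_matches <- pareto_pairs(pairs, d)
```

In this toy example the greedy result omits the B-C pair even though B is the closest unit to C, which is the trade-off the greedy.match description mentions.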
case.match uses statistical matching to select cases in a quantitative data set for subsequent qualitative analysis in "most similar" and "most different" research designs.
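To make the distance and design.type options concrete, here is a minimal base R sketch of the three distance metrics and of picking the closest or farthest pair. It is an illustration under stated assumptions, not the code case.match runs: the data values are made up, and pick_pair is a hypothetical helper.

```r
# Toy data: four hypothetical cases measured on two covariates (made-up values).
X <- matrix(c(1,   10,
              2,   12,
              1.5, 30,
              8,   11),
            ncol = 2, byrow = TRUE,
            dimnames = list(c("A", "B", "C", "D"), c("x1", "x2")))

# Euclidean distance on the raw scale.
d_euclid <- as.matrix(dist(X))

# "Standardized": divide each column by its standard deviation first.
X_std <- sweep(X, 2, apply(X, 2, sd), "/")
d_std <- as.matrix(dist(X_std))

# Mahalanobis: stats::mahalanobis() returns squared distances from every row
# to one center, so loop over the rows and take the square root.
S <- cov(X)
d_mahal <- sqrt(sapply(seq_len(nrow(X)), function(i) mahalanobis(X, X[i, ], S)))
dimnames(d_mahal) <- list(rownames(X), rownames(X))

# Hypothetical helper: return the pair with the smallest ("most similar")
# or largest ("most different") distance.
pick_pair <- function(d, design.type = c("most similar", "most different")) {
  design.type <- match.arg(design.type)
  d[lower.tri(d, diag = TRUE)] <- NA  # keep each unordered pair once
  idx <- if (design.type == "most similar") which.min(d) else which.max(d)
  pos <- arrayInd(idx, dim(d))
  c(rownames(d)[pos[1]], colnames(d)[pos[2]])
}

similar_pair   <- pick_pair(d_euclid, "most similar")
different_pair <- pick_pair(d_std, "most different")
```

The choice of metric matters when covariates are on very different scales: on the raw scale x2 dominates the Euclidean distance, while the standardized and Mahalanobis versions rescale it.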
case.match returns a named list with the following elements:
cases |
A table of the matched cases. |
case.distances |
A list of the distances between matched cases. |
Rich Nielsen
Nielsen, Richard. 2016. "Case Selection via Matching," Sociological Methods and Research, 45 (3): 569-597. http://www.mit.edu/~rnielsen/Case
library(caseMatch)
data(EU)
mvars <- c("socialist","rgdpc","FHc","FHp","trade")
dropvars <- c("countryname","population")

## In this example, I subset to the first 40 obs. to cut run-time
out <- case.match(data=EU[1:40,], id.var="countryname", leaveout.vars=dropvars,
                  distance="mahalanobis", case.N=2,
                  number.of.matches.to.return=10, treatment.var="eu",
                  max.variance=TRUE)
out$cases

## Not run:
## All cases:
## Find the best matches of EU to non-EU countries
out <- case.match(data=EU, id.var="countryname", leaveout.vars=dropvars,
                  distance="mahalanobis", case.N=2,
                  number.of.matches.to.return=10, treatment.var="eu",
                  max.variance=TRUE)
out$cases

## Find the best matches while downweighting political variables
myvarweights <- c(1,1,.1,.1,.1)
names(myvarweights) <- c("rgdpc","trade","FHp","FHc","socialist")
myvarweights
(case.match(data=EU, id.var="countryname", leaveout.vars=dropvars,
            distance="mahalanobis", case.N=2,
            number.of.matches.to.return=10, treatment.var="eu",
            max.variance=TRUE, varweights=myvarweights))$cases

## Find the best non-EU matches for Germany
tabGer <- case.match(data=EU, match.case="German Federal Republic",
                     id.var="countryname", leaveout.vars=dropvars,
                     distance="mahalanobis", case.N=2,
                     number.of.matches.to.return=10, max.variance=TRUE,
                     treatment.var="eu")
## End(Not run)
A cross-national data set including economic and political variables for 189 countries, averaged from 1980-1992.
data(EU)
A data frame with 185 observations on the following 13 variables.
countryname
The name of the country
population
Country population from Gleditsch.
rgdpc
GDP per capita from Gleditsch.
trade
Trade from Gleditsch.
FHp
Freedom House political rights.
FHc
Freedom House civil rights.
socialist
An indicator for countries that were socialist during the Cold War.
eu
An indicator for EU members.
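Before matching on a data set shaped like this one, it can help to confirm that the identifier is unique and that the indicator columns are coded 0/1. The sketch below is only an illustration: toy is a made-up stand-in with a few of the column names listed above (not the real EU values), and check_case_data is a hypothetical helper, not part of the package.

```r
# Hypothetical stand-in for a data set shaped like EU
# (made-up values, only a subset of the documented columns).
toy <- data.frame(
  countryname = c("Country A", "Country B", "Country C", "Country D"),
  rgdpc       = c(1200, 15000, 8000, 300),
  socialist   = c(0, 0, 1, 0),
  eu          = c(0, 1, 0, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: stops with an error unless the id column exists,
# has no missing or duplicated values, and each indicator column holds only 0/1.
check_case_data <- function(data, id.var, indicator.vars = character(0)) {
  ids <- data[[id.var]]
  stopifnot(!is.null(ids), !anyNA(ids), !anyDuplicated(ids))
  for (v in indicator.vars) {
    stopifnot(all(data[[v]] %in% c(0, 1)))
  }
  invisible(TRUE)
}

ok <- check_case_data(toy, id.var = "countryname",
                      indicator.vars = c("socialist", "eu"))
```

The same call could be pointed at the real data, for example with id.var = "countryname" and the eu indicator, once the package data are loaded.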
A cross-national data set including economic and political variables for 189 countries, averaged from 1980-1992. Data are collected by Gleditsch and Freedom House.
Gleditsch, Kristian Skrede. (2004) Expanded Trade and GDP Data, Version 4.0. http://privatewww.essex.ac.uk/~ksg/exptradegdp.html
http://www.freedomhouse.org/report-types/freedom-world
Nielsen, Richard A. 2016. "Case Selection via Matching," Sociological Methods and Research, 45 (3): 569-597. http://www.mit.edu/~rnielsen/Case Selection via Matching.pdf
data(EU)