match.data function - RDocumentation (2024)

Description

match.data() and get_matches() create a data frame with additional variables for the distance measure, matching weights, and subclasses after matching. This dataset can be used to estimate treatment effects after matching or subclassification. get_matches() is most useful after matching with replacement; otherwise, match.data() is more flexible. See Details below for the difference between them.

Usage

match.data(
  object,
  group = "all",
  distance = "distance",
  weights = "weights",
  subclass = "subclass",
  data = NULL,
  include.s.weights = TRUE,
  drop.unmatched = TRUE
)

get_matches(
  object,
  distance = "distance",
  weights = "weights",
  subclass = "subclass",
  id = "id",
  data = NULL,
  include.s.weights = TRUE
)

Value

A data frame containing the data supplied in the data argument or in the original call to matchit(), with the computed output variables appended as additional columns, named according to the arguments above. For match.data(), the group and drop.unmatched arguments control whether only subsets of the data are returned. See Details below for how match.data() and get_matches() differ. Note that get_matches() sorts the data by subclass and treatment status, unlike match.data(), which uses the order of the data.

The returned data frame will contain the variables in the original dataset or the dataset supplied to data, plus the following columns:

distance

The propensity score, if estimated or supplied to the distance argument in matchit() as a vector.

weights

The computed matching weights. These must be used in effect estimation to correctly incorporate the matching.

subclass

Matching strata membership. Units with the same value are in the same stratum.

id

The ID of each unit, corresponding to the row names in the original data or dataset supplied to data. Only included in get_matches() output. This column can be used to identify which rows belong to the same unit, since the same unit may appear multiple times if reused in matching with replacement.

These columns will take on the names supplied to the corresponding arguments in the call to match.data() or get_matches(). See Examples for an example of renaming the distance column to "prop.score".

If data or the original dataset supplied to matchit() was a data.table or tbl, the match.data() output will have the same class, but the get_matches() output will always be a base R data.frame.

In addition to their base class (e.g., data.frame or tbl), returned objects have the class matchdata or getmatches. This class is important when using rbind() to append matched datasets.
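As a rough illustration of appending matched datasets, the sketch below matches separately within two race groups of the lalonde data and stacks the results with rbind(). The subgroup split, the reduced covariate set, and the object names (lalonde_h, md_h, and so on) are assumptions made for this example only.

```r
library(MatchIt)
data("lalonde", package = "MatchIt")

# Illustrative split: match within each race group separately
lalonde_h <- subset(lalonde, race == "hispan")
lalonde_w <- subset(lalonde, race == "white")

m_h <- matchit(treat ~ age + educ + re74 + re75, data = lalonde_h)
m_w <- matchit(treat ~ age + educ + re74 + re75, data = lalonde_w)

# Supply the data explicitly rather than relying on environment lookup
md_h <- match.data(m_h, data = lalonde_h)
md_w <- match.data(m_w, data = lalonde_w)

# Because both objects carry the matchdata class, rbind() dispatches to
# the method for matched datasets (see rbind.matchdata() in See Also)
md <- rbind(md_h, md_w)

dim(md)
class(md)
```

Inspecting levels(md$subclass) afterwards is a quick way to check how pair labels from the two separate matches were combined.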

Arguments

object

a matchit object; the output of a call to matchit().

group

which group should comprise the matched dataset: "all" for all units, "treated" for just treated units, or "control" for just control units. Default is "all".

distance

a string containing the name that should be given to the variable containing the distance measure in the data frame output. Default is "distance", but "prop.score" or similar might be a good alternative if propensity scores were used in matching. Ignored if a distance measure was not supplied or estimated in the call to matchit().

weights

a string containing the name that should be given to the variable containing the matching weights in the data frame output. Default is "weights".

subclass

a string containing the name that should be given to the variable containing the subclasses or matched pair membership in the data frame output. Default is "subclass".

data

a data frame containing the original dataset to which the computed output variables (distance, weights, and/or subclass) should be appended. If empty, match.data() and get_matches() will attempt to find the dataset using the environment of the matchit object, which can be unreliable; see Notes.

include.s.weights

logical; whether to multiply the estimated weights by the sampling weights supplied to matchit(), if any. Default is TRUE. If FALSE, the weights in the match.data() or get_matches() output should be multiplied by the sampling weights before being supplied to the function estimating the treatment effect in the matched data.

drop.unmatched

logical; whether the returned data frame should contain all units (FALSE) or only units that were matched (i.e., have a matching weight greater than zero) (TRUE). Default is TRUE to drop unmatched units.

id

a string containing the name that should be given to the variable containing the unit IDs in the data frame output. Default is "id". Only used with get_matches(); for match.data(), the unit IDs are stored in the row names of the returned data frame.
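The following sketch shows how the group, drop.unmatched, and include.s.weights arguments change what match.data() returns. The lalonde data contain no sampling weights, so the sw column is a made-up placeholder, and the covariate set and object names are likewise assumptions for illustration.

```r
library(MatchIt)
data("lalonde", package = "MatchIt")

# Hypothetical sampling weights; lalonde does not ship with any
set.seed(123)
lalonde$sw <- runif(nrow(lalonde), min = 0.5, max = 2)

m.out <- matchit(treat ~ age + educ + re74 + re75,
                 data = lalonde, s.weights = ~sw)

# Default: matched units only
md_matched <- match.data(m.out, data = lalonde)

# Keep every unit; unmatched units carry a weight of zero
md_all <- match.data(m.out, data = lalonde, drop.unmatched = FALSE)

# Only the matched treated units
md_treated <- match.data(m.out, data = lalonde, group = "treated")

# Matching weights without the sampling weights folded in ...
md_raw <- match.data(m.out, data = lalonde, include.s.weights = FALSE)
# ... so combine them by hand before estimating the effect
md_raw$final.w <- md_raw$weights * md_raw$sw

nrow(md_all) == nrow(lalonde)   # all original units are present
all(md_matched$weights > 0)     # no zero-weight rows by default
unique(md_treated$treat)        # treated units only
```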

Details

match.data() creates a dataset with one row per unit. It will be identical to the dataset supplied except that several new columns will be added containing information related to the matching. When drop.unmatched = TRUE, the default, units with weights of zero, which are those units that were discarded by common support or the caliper or were simply not matched, will be dropped from the dataset, leaving only the subset of matched units. The idea is for the output of match.data() to be used as the dataset input in calls to glm() or similar to estimate treatment effects in the matched sample. It is important to include the weights in the estimation of the effect and its standard error. The subclass column, when created, contains pair or subclass membership and should be used to estimate the effect and its standard error. Subclasses will only be included if there is a subclass component in the matchit object, which does not occur with matching with replacement, in which case get_matches() should be used. See vignette("estimating-effects") for information on how to use match.data() output to estimate effects.
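A minimal sketch of that workflow, assuming 1:1 nearest-neighbor matching without replacement and a deliberately simple outcome model (re78 regressed on treat alone): the weights column is passed to lm(), and the subclass column is used for cluster-robust standard errors. The sandwich package is an assumed optional dependency here; the vignette remains the authoritative guide to model choice.

```r
library(MatchIt)
data("lalonde", package = "MatchIt")

m.out <- matchit(treat ~ age + educ + married + race +
                   nodegree + re74 + re75,
                 data = lalonde)

md <- match.data(m.out, data = lalonde)

# The matching weights must be supplied to the outcome model
fit <- lm(re78 ~ treat, data = md, weights = weights)
coef(fit)["treat"]

# Pair membership enters through cluster-robust standard errors
# (only run if the optional sandwich package is installed)
if (requireNamespace("sandwich", quietly = TRUE)) {
  vc <- sandwich::vcovCL(fit, cluster = ~subclass)
  sqrt(diag(vc))["treat"]
}
```

The standard errors printed by summary(fit) ignore the pairing, which is why the subclass column is passed to the variance estimator instead.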

get_matches() is similar to match.data(); the primary difference occurs when matching is performed with replacement, i.e., when units do not belong to a single matched pair. In this case, the output of get_matches() will be a dataset that contains one row per unit for each pair they are a part of. For example, if matching was performed with replacement and a control unit was matched to two treated units, that control unit will have two rows in the output dataset, one for each pair it is a part of. Weights are computed for each row, and, for control units, are equal to the inverse of the number of control units in each control unit's subclass; treated units get a weight of 1. Unmatched units are dropped. An additional column with unit IDs will be created (named using the id argument) to identify when the same unit is present in multiple rows. This dataset structure allows for the inclusion of both subclass membership and repeated use of units, unlike the output of match.data(), which lacks subclass membership when matching is done with replacement. A match.matrix component of the matchit object must be present to use get_matches(); in some forms of matching, it is absent, in which case match.data() should be used instead. See vignette("estimating-effects") for information on how to use get_matches() output to estimate effects after matching with replacement.
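The weighting rule described above can be checked by hand. The sketch below assumes 2:1 matching with replacement on a reduced covariate set (both choices are illustrative) and recomputes each row's weight from the number of control rows in its subclass.

```r
library(MatchIt)
data("lalonde", package = "MatchIt")

m.out <- matchit(treat ~ age + educ + re74 + re75,
                 data = lalonde, replace = TRUE, ratio = 2)

gm <- get_matches(m.out, data = lalonde)

# Reused control units show up as repeated values of the id column
any(duplicated(gm$id))

# Number of control rows in each subclass
n_ctrl <- tapply(gm$treat == 0, gm$subclass, sum)

# Treated rows get 1; control rows get 1 / (controls in their subclass)
expected <- ifelse(gm$treat == 1,
                   1,
                   1 / n_ctrl[as.character(gm$subclass)])

all.equal(as.numeric(expected), as.numeric(gm$weights))
```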

See Also

matchit(); rbind.matchdata()

vignette("estimating-effects") for uses of match.data() and get_matches() in estimating treatment effects.

Examples

library(MatchIt)
data("lalonde")

# 4:1 matching w/replacement
m.out1 <- matchit(treat ~ age + educ + married +
                    race + nodegree + re74 + re75,
                  data = lalonde, replace = TRUE,
                  caliper = .05, ratio = 4)

# match.data(): rename the distance column to "prop.score"
m.data1 <- match.data(m.out1, data = lalonde,
                      distance = "prop.score")
dim(m.data1) # one row per matched unit
head(m.data1, 10)

# get_matches(): one row per unit per pair it belongs to
g.matches1 <- get_matches(m.out1, data = lalonde,
                          distance = "prop.score")
dim(g.matches1) # multiple rows per matched unit
head(g.matches1, 10)

