
I am trying to estimate the mean of a variable for two different groups. However, the two groups differ in terms of age, gender, and education. To be more concrete, let's say my groups are immigrants and natives, and the characteristics of the two groups are different, i.e. immigrants are younger, less educated, etc.

Hence, the difference between simple means across groups can be a result of being an immigrant, but it can also be due to the different characteristics. Therefore, I have to control for group characteristics among natives and immigrants.

One way to do that is to match immigrants with natives who have similar characteristics and then calculate the mean for immigrants and for the similar natives. I was wondering if there is another way of doing this without any matching methods, using some simple Stata commands.

Example illustrated with the auto data in Stata. Suppose you want to find the mean of a variable, say price, by foreign, where foreign consists of two groups (foreign = 0 is Domestic and foreign = 1 is Foreign). Without controls, use linear regression (for simplicity I am assuming linearity in all variables):

    reg price foreign

Comparing this with the simple group means, the intercept gives the mean price for Domestic, and the intercept plus the coefficient on foreign gives the mean price for Foreign.

If you want to control for other variables without matching (which you already mentioned), the simplest route is again linear regression. Say you want to control for weight and length:

    reg price foreign weight length

Now the coefficient on foreign gives the difference in mean price between Foreign and Domestic after controlling for length and weight. The intercept by itself is the predicted price for a domestic car with weight and length equal to zero, so it only reads as an adjusted mean for Domestic if the controls are first centered, for example at their sample means. With centered controls, the intercept plus the coefficient on foreign gives the adjusted mean price for Foreign.

The code below shows how to plot the means and confidence interval bars for groups defined by two categorical variables. This type of plot appeared in an article by Baker, et al, in The American Journal of Clinical Nutrition, "High prepregnant body mass index is associated with early termination of full ...".

We start by generating the "bare bones" version of such a graph.

    * dataset name is missing here
    use, clear
    collapse (mean) y = `varname' (semean) se_y = `varname', by(`group')
    ...
        ytitle("Mean of `varname' by `group'") yline(50) legend(off) scheme(lean1)

While this graph is of the general desired form, it is not clear from looking at it whether there are three levels of SES and four levels of race, or vice versa. Ideally, different markers would be used for each of the four levels seen in one group of points, the labeling would be more explicit, and the dotted lines attaching the means would be removed. The code below is more complex, but it addresses these points.

    collapse (mean) y = `varname' (semean) se_y = `varname', by(`group1' `group2')
    ...
    twoway (scatter y1 x, msymbol(S) msize(medium)) ///
        (scatter y2 x, msymbol(Oh) msize(large)) ///
        (scatter y3 x, msymbol(T) msize(medium)) ///
        (scatter y4 x, msymbol(D) msize(medium)) ///
        ...
        ytitle(Mean of `varname' by `group1' and `group2') yline(50, lpattern(dash)) ///
        ...

We may wish to add additional text to this plot, like p-values from ANOVA tests on race conducted within SES levels. The code below starts from the original dataset to generate these values, saves them as macro variables, and then puts them into the plot.
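As a rough illustration of estimating group means with and without controls and then plotting them with confidence intervals, the sketch below uses Stata's built-in auto dataset. The use of margins and marginsplot is an assumption of this sketch, not something prescribed above; it is one common way to get covariate-adjusted group means without centering the controls by hand.

```stata
* Sketch only: relies on the auto dataset that ships with Stata
sysuse auto, clear

* Unadjusted mean of price for each level of foreign, with 95% CIs
mean price, over(foreign)

* Regression with foreign as a factor variable plus two controls
regress price i.foreign weight length

* Adjusted mean of price for each level of foreign, averaging the
* predictions over the observed values of weight and length
margins foreign

* Plot the adjusted means with their confidence interval bars
marginsplot, recast(scatter) ytitle("Adjusted mean price")
```

Because margins averages predictions over the sample's own covariate values, the two adjusted means differ by exactly the coefficient on foreign, which is the comparison the regression is meant to deliver.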
