Harvard Mark I 1940s

Monday, November 27, 2017

Modeling Random Samples from Normal Distributions with OpenOffice Calc & C++ Part I



I shall present some observations on modeling random samples drawn from given normal distributions, and on how well the samples' summary statistics approximate those of the distributions they are drawn from. The normal distributions were generated in an OpenOffice Calc spreadsheet using the function NORMINV(num; mean; stddev). With RAND() for num, a mean of 100, and a standard deviation of 34.135, I generated data sets of 1,000, 10,000, and 350,000 values. With RAND() for num, a mean of 134.135, and a standard deviation of 34.135, I generated data sets of 500, 5,000, and 50,000 values. All values are real numbers.
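For readers who would rather generate comparable data without a spreadsheet, here is a minimal C++ sketch. It is not the method used for this post: it swaps in std::normal_distribution from the standard library, whereas NORMINV(RAND(); mean; stddev) applies the inverse normal CDF to a uniform random number. The function name generate_normal and the seed parameter are hypothetical and were added only for illustration and repeatability.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Generate n normally distributed values with the given mean and standard
// deviation. The seed parameter is an assumption added so that runs are
// repeatable; OpenOffice's RAND() offers no such control.
std::vector<double> generate_normal(std::size_t n, double mean, double stddev,
                                    unsigned seed) {
    assert(stddev > 0.0);
    std::mt19937 rng(seed);                               // Mersenne Twister engine
    std::normal_distribution<double> dist(mean, stddev);  // N(mean, stddev^2)
    std::vector<double> values(n);
    for (double& v : values) {
        v = dist(rng);
    }
    return values;
}
```

A call such as generate_normal(1000, 100.0, 34.135, 7u) would play the role of the 1,000-value spreadsheet column; the values could then be written out one per line to obtain a .csv file.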

The C++ programs samples and errors read a normal distribution that OpenOffice has generated and saved in comma-delimited text (.csv) format into a standard C++ vector of floating-point values. Once the distribution has been read in, the vector is loaded into a custom C++ statistical calculations class I have implemented, and a general statistical summary report for the entire population is printed at the beginning of a text file. Random samples of every size from 2 up to the size of the population are then taken, and each sample's statistical summary is added as a row to a table in the text file, so that every row can be compared with the initial summary of the distribution considered as an entire population. Below is an example of the text file output for one run of ./samples 500.csv at the bash terminal command line, or samples 500.csv at the Windows command prompt.

 population
 n = 500
 mean = 101.367
 median = 100.863
 variance = 1032.66
 standard deviation = 32.135
 mean deviation = 25.7281
 median deviation = 22.2732
 skewness = -0.10289
 median skewness = 0.0470683
 random samples
  size     mean   median      var      std  meandev   mdndev      skw   mdnskw
     2  108.461  108.461 2996.376   54.739   38.706   38.706   -0.000    0.000
     3   97.253   92.826 1545.216   39.309   27.557   32.481    0.111    0.338
     4   89.384   87.200  967.988   31.113   25.465   23.281    0.095    0.211
     5  111.371  116.794  366.891   19.154   14.484   15.326   -0.105   -0.849
     6   99.451   98.133  798.427   28.256   23.507   20.010    0.116    0.140
     7  107.015  117.532 2015.066   44.889   33.987   29.466   -0.178   -0.703
     8   85.060   94.501 2182.623   46.719   31.297   20.638   -1.102   -0.606
                                   . . .
   492  101.266  100.863 1045.855   32.340   25.884   22.710   -0.097    0.037
   493  101.477  100.967 1017.018   31.891   25.506   22.102   -0.097    0.048
   494  101.216  100.659 1038.744   32.230   25.741   22.283   -0.096    0.052
   495  101.336  100.967 1034.720   32.167   25.718   22.268   -0.111    0.034
   496  101.572  100.991 1033.031   32.141   25.724   22.381   -0.106    0.054
   497  101.343  100.967 1026.704   32.042   25.636   22.268   -0.113    0.035
   498  101.290  100.863 1021.343   31.958   25.577   22.188   -0.127    0.040
   499  101.314  100.759 1035.407   32.178   25.726   22.259   -0.099    0.052
   500  101.367  100.863 1034.730   32.167   25.728   22.273   -0.103    0.047
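
The pipeline that produces a report like the one above can be sketched in a few short functions. This is only a sketch under stated assumptions, not the statistics class used for this post: all names (read_values, median_of, Summary, summarize, random_sample) are hypothetical, and the formulas are inferred from the printed numbers. The population standard deviation is the square root of the printed variance, the sample rows are consistent with an n - 1 divisor, the median deviation looks like the median of absolute deviations from the median, and the median skewness matches 3(mean - median)/std. Sampling is assumed to be without replacement, which fits the size-500 row reproducing the population mean and median.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <istream>
#include <numeric>
#include <random>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical reader: accepts values separated by commas and/or newlines.
std::vector<double> read_values(std::istream& in) {
    std::vector<double> values;
    std::string line, token;
    while (std::getline(in, line)) {
        std::stringstream fields(line);
        while (std::getline(fields, token, ',')) {
            // Skip blank fields (trailing commas or a stray '\r').
            if (token.find_first_not_of(" \t\r") == std::string::npos) continue;
            values.push_back(std::stod(token));
        }
    }
    return values;
}

// Median of a copy of the data; averages the two middle values when n is even.
double median_of(std::vector<double> v) {
    assert(!v.empty());
    std::sort(v.begin(), v.end());
    const std::size_t mid = v.size() / 2;
    return v.size() % 2 ? v[mid] : 0.5 * (v[mid - 1] + v[mid]);
}

// Hypothetical summary record; the field names are illustrative.
struct Summary {
    double mean, median, variance, stddev, meandev, mediandev, skew, medianskew;
};

// Summarize x. is_sample selects the n - 1 divisor for the variance
// (assumed for sample rows); otherwise the divisor is n (population).
Summary summarize(const std::vector<double>& x, bool is_sample) {
    assert(x.size() >= 2);
    const double n = static_cast<double>(x.size());
    Summary s{};  // all fields start at zero
    s.mean = std::accumulate(x.begin(), x.end(), 0.0) / n;
    s.median = median_of(x);
    double sum_sq = 0.0, sum_cube = 0.0, sum_abs = 0.0;
    std::vector<double> abs_from_median;
    abs_from_median.reserve(x.size());
    for (double v : x) {
        const double d = v - s.mean;
        sum_sq += d * d;
        sum_cube += d * d * d;
        sum_abs += std::fabs(d);
        abs_from_median.push_back(std::fabs(v - s.median));
    }
    s.variance = sum_sq / (is_sample ? n - 1.0 : n);
    s.stddev = std::sqrt(s.variance);
    s.meandev = sum_abs / n;                   // mean absolute deviation from the mean
    s.mediandev = median_of(abs_from_median);  // median absolute deviation from the median
    if (s.stddev > 0.0) {                      // skewness stays 0 for constant data
        s.skew = (sum_cube / n) / (s.stddev * s.stddev * s.stddev);
        s.medianskew = 3.0 * (s.mean - s.median) / s.stddev;
    }
    return s;
}

// Draw k distinct elements by shuffling a copy and keeping the first k
// (sampling without replacement).
std::vector<double> random_sample(std::vector<double> pop, std::size_t k,
                                  std::mt19937& rng) {
    assert(k <= pop.size());
    std::shuffle(pop.begin(), pop.end(), rng);
    pop.resize(k);
    return pop;
}
```

With these pieces, the table would come from calling summarize(population, false) once for the header, then looping k from 2 to the population size and printing summarize(random_sample(population, k, rng), true) as one row per k. Shuffling a fresh copy for each k keeps the sketch short at the cost of extra copying, which is negligible for 500 values but noticeable for the larger data sets.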

The C++ code can be found at: 
 
