Algorithm Details: Histogram of all Numeric Values

A histogram is a graphical representation of the distribution of numerical data. It differs from a bar graph in that a bar graph relates two variables, whereas a histogram relates only one. To construct a histogram, the first step is to bin the range of values, that is, to divide the entire range into a series of intervals, and then count how many values fall into each interval. The bins (intervals) are usually specified as consecutive, non-overlapping intervals of a variable. They must be adjacent and are often (but are not required to be) of equal size. In the portal, the CSM histogram algorithm does the following:

  • Examines each variable separately (shape of the distribution, its variability, central tendency, and other simple statistics) in order to detect systematic data collection errors or fabricated data.
  • For every selected variable, creates a histogram and calculates simple statistics: range, mean, standard deviation, number of missing values, and some others.
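
The binning and the simple statistics described above can be illustrated with a minimal Python sketch. It is not the portal's implementation: the function name summarize_variable, the n_bins parameter, the use of equal-width bins, and the treatment of None and NaN as missing values are all assumptions made for this example.

```python
import math
import statistics


def summarize_variable(values, n_bins=5):
    """Hypothetical helper: equal-width histogram plus simple statistics."""
    # Assumption: None and NaN mark missing values (not necessarily the portal's rule).
    present = [
        v for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    ]
    n_missing = len(values) - len(present)
    if not present:
        raise ValueError("no non-missing values to summarize")

    lo, hi = min(present), max(present)
    # Fall back to a width of 1.0 when all values are equal (zero range).
    width = (hi - lo) / n_bins or 1.0

    counts = [0] * n_bins
    for v in present:
        # Bin i covers [lo + i*width, lo + (i+1)*width); the maximum
        # value is placed in the last bin so that every value is counted.
        i = min(int((v - lo) / width), n_bins - 1)
        counts[i] += 1

    return {
        "edges": [lo + i * width for i in range(n_bins + 1)],
        "counts": counts,
        "range": (lo, hi),
        "mean": statistics.fmean(present),
        # Sample standard deviation; undefined for a single value, so report 0.0.
        "std": statistics.stdev(present) if len(present) > 1 else 0.0,
        "n_missing": n_missing,
    }


summary = summarize_variable([1.0, 2.0, 2.5, 3.0, None, 4.0, 5.0], n_bins=4)
print(summary["counts"], summary["n_missing"])  # prints [1, 2, 1, 2] 1
```

Here the six non-missing values span the range 1.0 to 5.0, so four equal-width bins are each 1.0 wide, and the single None is reported as one missing value. A variable whose counts pile up in unexpected bins, or whose standard deviation is unusually small, is the kind of pattern such a summary is meant to surface for further review.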

