Overview
Dataset statistics
| Number of variables | 9 |
|---|---|
| Number of observations | 537 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Total size in memory | 38.3 KiB |
| Average record size in memory | 73.0 B |
Variable types
| Numeric | 8 |
|---|---|
| Boolean | 1 |
Alerts
| Alert | Type |
|---|---|
| Pregnancies has 76 (14.2%) zeros | Zeros |
| BloodPressure has 19 (3.5%) zeros | Zeros |
| SkinThickness has 154 (28.7%) zeros | Zeros |
| Insulin has 261 (48.6%) zeros | Zeros |
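Each percentage in these alerts is the column's zero count divided by the 537 observations. The following is a minimal sketch that recomputes them from the counts reported above; the helper name `zero_share` and the dictionary layout are illustrative assumptions, not part of ydata-profiling.

```python
# Recompute the zero percentages shown in the alerts from the reported counts.
# `zero_share` is a hypothetical helper, not a ydata-profiling function.
N_OBSERVATIONS = 537  # "Number of observations" in the overview

ZERO_COUNTS = {
    "Pregnancies": 76,
    "BloodPressure": 19,
    "SkinThickness": 154,
    "Insulin": 261,
}


def zero_share(zero_count: int, n_rows: int) -> float:
    """Return the share of zeros as a percentage rounded to one decimal."""
    return round(100 * zero_count / n_rows, 1)


zero_percentages = {
    column: zero_share(count, N_OBSERVATIONS)
    for column, count in ZERO_COUNTS.items()
}

for column, pct in zero_percentages.items():
    print(f"{column} has {ZERO_COUNTS[column]} ({pct}%) zeros")
```

A zero is a plausible value for Pregnancies, whereas for a measurement such as BloodPressure it may stand in for a missing reading; the report itself does not say which.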
Reproduction
| Analysis started | 2026-03-12 11:28:35.345016 |
|---|---|
| Analysis finished | 2026-03-12 11:28:35.365284 |
| Duration | 0.02 seconds |
| Software version | ydata-profiling v4.18.1 |
| Download configuration | config.json |
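A report of this kind can be regenerated with ydata-profiling's `ProfileReport`. The sketch below assumes the data is stored in a CSV file; the helper name `build_report`, the report title, and the file names are placeholders, since the source does not name the input file.

```python
# Minimal sketch: profile a CSV with ydata-profiling and write an HTML report.
# The helper name, title, and file names are hypothetical.
from pathlib import Path


def build_report(csv_path: str, out_path: str = "report.html") -> Path:
    """Profile the CSV at `csv_path` and save the report to `out_path`."""
    # Imported lazily so this module loads even where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Dataset profile")
    profile.to_file(out_path)
    return Path(out_path)


# Example call (requires pandas, ydata-profiling, and an existing CSV file):
# build_report("data.csv")
```

The start and finish timestamps, and therefore the duration, will differ on every run.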
Variables
Pregnancies
Real number (ℝ)
Zeros
| Distinct | 17 |
|---|---|
| Distinct (%) | 3.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.726256983 |
| Minimum | 0 |
| Maximum | 17 |
| Zeros | 76 |
| Zeros (%) | 14.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 3 |
| Q3 | 6 |
| 95-th percentile | 10 |
| Maximum | 17 |
| Range | 17 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 3.262964891 |
|---|---|
| Coefficient of variation (CV) | 0.8756682392 |
| Kurtosis | 0.3572700088 |
| Mean | 3.726256983 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.9228620592 |
| Sum | 2001 |
| Variance | 10.64693988 |
| Monotonicity | Not monotonic |
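Several of these figures are tied together by simple identities: the mean is the sum divided by the number of observations, the variance is the squared standard deviation, the coefficient of variation is the standard deviation divided by the mean, and the range and IQR are differences of quantiles. As a quick sanity check, the sketch below recomputes them from the reported Pregnancies values; the variable names are illustrative.

```python
# Cross-check the Pregnancies statistics against one another.
# All inputs are copied from the tables in this report.
n = 537            # number of observations
total = 2001       # Sum
std = 3.262964891  # Standard deviation
q1, q3 = 1, 6      # Q1 and Q3
minimum, maximum = 0, 17

mean = total / n                  # Mean = Sum / n
variance = std ** 2               # Variance = std^2
cv = std / mean                   # Coefficient of variation = std / mean
iqr = q3 - q1                     # Interquartile range = Q3 - Q1
value_range = maximum - minimum   # Range = Maximum - Minimum

print(f"mean={mean:.9f} variance={variance:.8f} cv={cv:.10f}")
print(f"iqr={iqr} range={value_range}")
```

The recomputed values agree with the table up to rounding of the printed inputs.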
Histogram with fixed size bins (bins=17)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 106 | 19.7% |
| 0 | 76 | 14.2% |
| 2 | 65 | 12.1% |
| 3 | 54 | 10.1% |
| 4 | 45 | 8.4% |
| 5 | 42 | 7.8% |
| 6 | 39 | 7.3% |
| 7 | 31 | 5.8% |
| 8 | 26 | 4.8% |
| 9 | 19 | 3.5% |
| Other values (7) | 34 | 6.3% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 76 | 14.2% |
| 1 | 106 | 19.7% |
| 2 | 65 | 12.1% |
| 3 | 54 | 10.1% |
| 4 | 45 | 8.4% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 17 | 1 | 0.2% |
| 15 | 1 | 0.2% |
| 14 | 2 | 0.4% |
| 13 | 3 | 0.6% |
| 12 | 4 | 0.7% |
Glucose
Real number (ℝ)
| Distinct | 128 |
|---|---|
| Distinct (%) | 23.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 120.849162 |
| Minimum | 0 |
| Maximum | 199 |
| Zeros | 5 |
| Zeros (%) | 0.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 79 |
| Q1 | 99 |
| median | 117 |
| Q3 | 139 |
| 95-th percentile | 181 |
| Maximum | 199 |
| Range | 199 |
| Interquartile range (IQR) | 40 |
Descriptive statistics
| Standard deviation | 32.33952292 |
|---|---|
| Coefficient of variation (CV) | 0.2676023762 |
| Kurtosis | 1.116704041 |
| Mean | 120.849162 |
| Median Absolute Deviation (MAD) | 19 |
| Skewness | 0.07338163583 |
| Sum | 64896 |
| Variance | 1045.844743 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 99 | 15 | 2.8% |
| 125 | 12 | 2.2% |
| 100 | 11 | 2.0% |
| 114 | 11 | 2.0% |
| 108 | 10 | 1.9% |
| 106 | 10 | 1.9% |
| 111 | 10 | 1.9% |
| 95 | 10 | 1.9% |
| 128 | 10 | 1.9% |
| 122 | 10 | 1.9% |
| Other values (118) | 428 | 79.7% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 5 | 0.9% |
| 44 | 1 | 0.2% |
| 56 | 1 | 0.2% |
| 57 | 2 | 0.4% |
| 65 | 1 | 0.2% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 199 | 1 | 0.2% |
| 198 | 1 | 0.2% |
| 197 | 3 | 0.6% |
| 196 | 3 | 0.6% |
| 195 | 2 | 0.4% |
BloodPressure
Real number (ℝ)
Zeros
| Distinct | 44 |
|---|---|
| Distinct (%) | 8.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 69.68528864 |
| Minimum | 0 |
| Maximum | 122 |
| Zeros | 19 |
| Zeros (%) | 3.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 44 |
| Q1 | 64 |
| median | 72 |
| Q3 | 80 |
| 95-th percentile | 90 |
| Maximum | 122 |
| Range | 122 |
| Interquartile range (IQR) | 16 |
Descriptive statistics
| Standard deviation | 18.09437396 |
|---|---|
| Coefficient of variation (CV) | 0.2596584489 |
| Kurtosis | 5.802639348 |
| Mean | 69.68528864 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | -1.831636222 |
| Sum | 37421 |
| Variance | 327.406369 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=44)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 70 | 41 | 7.6% |
| 74 | 38 | 7.1% |
| 68 | 37 | 6.9% |
| 80 | 34 | 6.3% |
| 64 | 32 | 6.0% |
| 72 | 29 | 5.4% |
| 76 | 28 | 5.2% |
| 78 | 27 | 5.0% |
| 62 | 25 | 4.7% |
| 66 | 23 | 4.3% |
| Other values (34) | 223 | 41.5% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 19 | 3.5% |
| 24 | 1 | 0.2% |
| 30 | 2 | 0.4% |
| 38 | 1 | 0.2% |
| 40 | 1 | 0.2% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 122 | 1 | 0.2% |
| 110 | 2 | 0.4% |
| 106 | 2 | 0.4% |
| 104 | 2 | 0.4% |
| 102 | 1 | 0.2% |
SkinThickness
Real number (ℝ)
Zeros
| Distinct | 47 |
|---|---|
| Distinct (%) | 8.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 20.4320298 |
| Minimum | 0 |
| Maximum | 63 |
| Zeros | 154 |
| Zeros (%) | 28.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 23 |
| Q3 | 32 |
| 95-th percentile | 43.2 |
| Maximum | 63 |
| Range | 63 |
| Interquartile range (IQR) | 32 |
Descriptive statistics
| Standard deviation | 15.49071515 |
|---|---|
| Coefficient of variation (CV) | 0.7581584063 |
| Kurtosis | -1.127757802 |
| Mean | 20.4320298 |
| Median Absolute Deviation (MAD) | 12 |
| Skewness | -0.02648576357 |
| Sum | 10972 |
| Variance | 239.9622558 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=47)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 154 | 28.7% |
| 30 | 22 | 4.1% |
| 32 | 21 | 3.9% |
| 23 | 18 | 3.4% |
| 27 | 15 | 2.8% |
| 25 | 14 | 2.6% |
| 39 | 14 | 2.6% |
| 19 | 13 | 2.4% |
| 18 | 13 | 2.4% |
| 28 | 13 | 2.4% |
| Other values (37) | 240 | 44.7% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 154 | 28.7% |
| 8 | 1 | 0.2% |
| 10 | 4 | 0.7% |
| 11 | 5 | 0.9% |
| 12 | 6 | 1.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 63 | 1 | 0.2% |
| 60 | 1 | 0.2% |
| 52 | 1 | 0.2% |
| 51 | 1 | 0.2% |
| 50 | 3 | 0.6% |
Insulin
Real number (ℝ)
Zeros
| Distinct | 153 |
|---|---|
| Distinct (%) | 28.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 79.83612663 |
| Minimum | 0 |
| Maximum | 846 |
| Zeros | 261 |
| Zeros (%) | 48.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 36 |
| Q3 | 129 |
| 95-th percentile | 291.4 |
| Maximum | 846 |
| Range | 846 |
| Interquartile range (IQR) | 129 |
Descriptive statistics
| Standard deviation | 115.1967297 |
|---|---|
| Coefficient of variation (CV) | 1.442914812 |
| Kurtosis | 8.017463042 |
| Mean | 79.83612663 |
| Median Absolute Deviation (MAD) | 36 |
| Skewness | 2.358206734 |
| Sum | 42872 |
| Variance | 13270.28653 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 261 | 48.6% |
| 140 | 8 | 1.5% |
| 94 | 7 | 1.3% |
| 105 | 7 | 1.3% |
| 120 | 7 | 1.3% |
| 180 | 6 | 1.1% |
| 130 | 6 | 1.1% |
| 100 | 5 | 0.9% |
| 110 | 5 | 0.9% |
| 135 | 5 | 0.9% |
| Other values (143) | 220 | 41.0% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 261 | 48.6% |
| 14 | 1 | 0.2% |
| 18 | 1 | 0.2% |
| 22 | 1 | 0.2% |
| 23 | 2 | 0.4% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 846 | 1 | 0.2% |
| 744 | 1 | 0.2% |
| 600 | 1 | 0.2% |
| 543 | 1 | 0.2% |
| 540 | 1 | 0.2% |
BMI
Real number (ℝ)
| Distinct | 218 |
|---|---|
| Distinct (%) | 40.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 31.97560521 |
| Minimum | 0 |
| Maximum | 67.1 |
| Zeros | 5 |
| Zeros (%) | 0.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 21.8 |
| Q1 | 26.8 |
| median | 32 |
| Q3 | 36.5 |
| 95-th percentile | 44.52 |
| Maximum | 67.1 |
| Range | 67.1 |
| Interquartile range (IQR) | 9.7 |
Descriptive statistics
| Standard deviation | 7.624495387 |
|---|---|
| Coefficient of variation (CV) | 0.238447258 |
| Kurtosis | 2.755199972 |
| Mean | 31.97560521 |
| Median Absolute Deviation (MAD) | 4.8 |
| Skewness | -0.1080921598 |
| Sum | 17170.9 |
| Variance | 58.1329299 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 32 | 11 | 2.0% |
| 31.6 | 9 | 1.7% |
| 31.2 | 9 | 1.7% |
| 32.4 | 8 | 1.5% |
| 32.8 | 8 | 1.5% |
| 30.1 | 7 | 1.3% |
| 29.7 | 6 | 1.1% |
| 39.4 | 6 | 1.1% |
| 27.8 | 6 | 1.1% |
| 25.9 | 6 | 1.1% |
| Other values (208) | 461 | 85.8% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 5 | 0.9% |
| 18.2 | 3 | 0.6% |
| 19.3 | 1 | 0.2% |
| 19.4 | 1 | 0.2% |
| 19.5 | 1 | 0.2% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 67.1 | 1 | 0.2% |
| 59.4 | 1 | 0.2% |
| 55 | 1 | 0.2% |
| 52.9 | 1 | 0.2% |
| 49.7 | 1 | 0.2% |
DiabetesPedigreeFunction
Real number (ℝ)
| Distinct | 402 |
|---|---|
| Distinct (%) | 74.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.4699199255 |
| Minimum | 0.078 |
| Maximum | 2.42 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 KiB |
Quantile statistics
| Minimum | 0.078 |
|---|---|
| 5-th percentile | 0.137 |
| Q1 | 0.241 |
| median | 0.374 |
| Q3 | 0.612 |
| 95-th percentile | 1.1364 |
| Maximum | 2.42 |
| Range | 2.342 |
| Interquartile range (IQR) | 0.371 |
Descriptive statistics
| Standard deviation | 0.3420873633 |
|---|---|
| Coefficient of variation (CV) | 0.7279694788 |
| Kurtosis | 6.847192339 |
| Mean | 0.4699199255 |
| Median Absolute Deviation (MAD) | 0.167 |
| Skewness | 2.158685873 |
| Sum | 252.347 |
| Variance | 0.1170237641 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.258 | 5 | 0.9% |
| 0.268 | 5 | 0.9% |
| 0.261 | 4 | 0.7% |
| 0.26 | 3 | 0.6% |
| 0.299 | 3 | 0.6% |
| 0.235 | 3 | 0.6% |
| 0.19 | 3 | 0.6% |
| 0.245 | 3 | 0.6% |
| 0.292 | 3 | 0.6% |
| 0.205 | 3 | 0.6% |
| Other values (392) | 502 | 93.5% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.078 | 1 | 0.2% |
| 0.084 | 1 | 0.2% |
| 0.085 | 2 | 0.4% |
| 0.088 | 1 | 0.2% |
| 0.089 | 1 | 0.2% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2.42 | 1 | 0.2% |
| 2.329 | 1 | 0.2% |
| 2.288 | 1 | 0.2% |
| 2.137 | 1 | 0.2% |
| 1.893 | 1 | 0.2% |
Age
Real number (ℝ)
| Distinct | 51 |
|---|---|
| Distinct (%) | 9.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 33.0744879 |
| Minimum | 21 |
| Maximum | 81 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 KiB |
Quantile statistics
| Minimum | 21 |
|---|---|
| 5-th percentile | 21 |
| Q1 | 24 |
| median | 29 |
| Q3 | 41 |
| 95-th percentile | 57 |
| Maximum | 81 |
| Range | 60 |
| Interquartile range (IQR) | 17 |
Descriptive statistics
| Standard deviation | 11.68531899 |
|---|---|
| Coefficient of variation (CV) | 0.3533030967 |
| Kurtosis | 0.8842483866 |
| Mean | 33.0744879 |
| Median Absolute Deviation (MAD) | 7 |
| Skewness | 1.169307637 |
| Sum | 17761 |
| Variance | 136.54668 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 22 | 52 | 9.7% |
| 21 | 46 | 8.6% |
| 25 | 35 | 6.5% |
| 24 | 31 | 5.8% |
| 23 | 26 | 4.8% |
| 28 | 24 | 4.5% |
| 26 | 24 | 4.5% |
| 27 | 22 | 4.1% |
| 31 | 18 | 3.4% |
| 29 | 18 | 3.4% |
| Other values (41) | 241 | 44.9% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 21 | 46 | 8.6% |
| 22 | 52 | 9.7% |
| 23 | 26 | 4.8% |
| 24 | 31 | 5.8% |
| 25 | 35 | 6.5% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 81 | 1 | 0.2% |
| 72 | 1 | 0.2% |
| 70 | 1 | 0.2% |
| 69 | 2 | 0.4% |
| 68 | 1 | 0.2% |
Boolean
| Value | Count | Frequency (%) |
|---|---|---|
| False | 349 | 65.0% |
| True | 188 | 35.0% |