Good And Bad Classification Of Dhokla
Description
For the project on "Good and Bad Classification of Dhokla," we aim to develop a machine learning model to classify Dhokla samples based on their quality. The dataset consists of 1,000 Dhokla samples, with an equal distribution of 500 good and 500 bad samples. The classification is binary, meaning that each sample is labeled either as "Good" or "Bad" based on its sensory, physical, or chemical properties.

Data Collection: The dataset includes the following features for each Dhokla sample.

Sensory Attributes: These are human-perceptible features that contribute to the overall quality of Dhokla.
Texture Score (0-10): Measured through human assessment or mechanical methods, representing how spongy, soft, or firm the Dhokla is.
Taste Score (0-10): A subjective assessment of the flavor, balanced between sourness, sweetness, and other taste profiles.
Aroma Score (0-10): Measured by sensory experts based on how fresh or fermented the sample smells.
Appearance Score (0-10): Evaluation of color, visual uniformity, and any signs of overcooking or undercooking.

Physical Properties: Quantifiable aspects measured using specific instruments.
Moisture Content (%): Indicates the level of hydration, which plays a crucial role in texture and shelf life.
Density (g/cm³): A measure of the weight-to-volume ratio, giving an idea of how airy or dense the sample is.
pH Level: Reflects the acidity, which is a key factor in fermentation and taste.
Porosity (%): The percentage of air spaces in the sample, which influences texture and lightness.

Chemical Composition: These attributes are measured through chemical analysis to understand the composition.
Fermentation Level (CO₂ Production, ml): Indicates the degree of fermentation, which is essential for the traditional tangy flavor of Dhokla.
Sugar Content (g): Reflects the amount of sugar present, which impacts taste and the fermentation process.
Protein Content (g): A measure of protein, which affects both texture and nutritional value.
Fat Content (g): This impacts the mouthfeel and caloric content of the Dhokla.

Labels:
Good (1): A Dhokla sample that scores high on sensory attributes, meets moisture and density requirements, and shows an optimal level of fermentation.
Bad (0): A Dhokla sample that is too dense, dry, over- or under-fermented, or fails to meet sensory expectations.

Data Distribution:
500 samples labeled as Good (1).
500 samples labeled as Bad (0).

Data Usage: The dataset can be used to train and test machine learning models, particularly classifiers, to distinguish between good and bad Dhokla samples based on these sensory, physical, and chemical features. Potential models to consider include decision trees, support vector machines (SVM), and neural networks, depending on the complexity of the data and desired performance.

Challenges:
Imbalanced sensory scoring: Since sensory data is subjective, normalization or other pre-processing steps may be needed.
Feature correlation: Some features, such as moistur
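
The train/test usage and the normalization step mentioned above can be sketched in plain Python. This is only an illustrative sketch under stated assumptions: the column names (texture, taste, aroma, appearance, label) are hypothetical because the description does not give the dataset's schema, the rows are random placeholders rather than real Dhokla measurements, and z-score standardization is one possible choice of normalization, not necessarily the one the dataset authors intend. The split keeps the 50/50 Good/Bad balance in both subsets, and the scaling statistics are computed from the training rows only so that no information leaks from the test set.

```python
import random
import statistics

# Hypothetical column names; the dataset description does not specify a schema.
SENSORY = ["texture", "taste", "aroma", "appearance"]


def stratified_split(rows, label_key="label", test_fraction=0.2, seed=0):
    """Split rows into train/test while preserving the Good/Bad ratio."""
    rng = random.Random(seed)
    by_label = {}
    for row in rows:
        by_label.setdefault(row[label_key], []).append(row)
    train, test = [], []
    for group in by_label.values():
        group = group[:]  # copy so the caller's list is not reordered
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


def fit_zscore(rows, columns):
    """Compute per-column mean and standard deviation from the given rows."""
    stats = {}
    for col in columns:
        values = [row[col] for row in rows]
        stats[col] = (statistics.fmean(values), statistics.pstdev(values))
    return stats


def apply_zscore(rows, stats):
    """Return copies of rows with the fitted columns standardized."""
    out = []
    for row in rows:
        new = dict(row)
        for col, (mean, std) in stats.items():
            new[col] = (row[col] - mean) / std if std > 0 else 0.0
        out.append(new)
    return out


# Placeholder rows generated at random purely so the sketch runs;
# these are NOT real Dhokla data. Labels follow the source: Good = 1, Bad = 0.
rng = random.Random(42)
rows = [
    {**{c: rng.uniform(0, 10) for c in SENSORY}, "label": i % 2}
    for i in range(1000)
]

train, test = stratified_split(rows)
stats = fit_zscore(train, SENSORY)      # fit on training rows only
train_std = apply_zscore(train, stats)
test_std = apply_zscore(test, stats)    # reuse the training statistics
```

Standardized features matter mostly for scale-sensitive models such as SVMs and neural networks; decision trees split on thresholds and are largely unaffected by this kind of rescaling. The same fit-on-train, apply-to-test pattern would extend to the physical and chemical columns.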