Techniques in Handling Categorical Variables Discussion Responses

  • Discussion 1
    In statistics, a categorical attribute is a variable in a model that can take only a fixed set
    of predetermined values. Its possible values therefore do not form a continuous gradient, as
    length or time do. Examples of categorical attributes include colors, kinds of shapes, and the
    names of persons or organizations.
    According to Barker (1980), categorical data types are attributes whose values are handled as
    separate symbols or mere names. The color of the iris of the human eye is an example, because
    it takes values such as black, green, blue, or gray. Because there is no direct relationship
    between the values, no mathematical operator can be applied to them except the logical “is
    equal” operator. Keep in mind that some categorical attributes can also be expressed on a
    continuous scale.
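    The point that only the “is equal” operator applies to categorical values can be illustrated
    with a minimal Python sketch. The EyeColor type and the helper names are hypothetical and
    chosen only for illustration; they do not come from Barker (1980).

```python
from enum import Enum


class EyeColor(Enum):
    """Hypothetical categorical attribute: values are names, not quantities."""
    BLACK = "black"
    GREEN = "green"
    BLUE = "blue"
    GRAY = "gray"


def same_color(a: EyeColor, b: EyeColor) -> bool:
    # Equality is the one comparison that is meaningful for categories.
    return a == b


def try_ordering(a: EyeColor, b: EyeColor):
    # A plain Enum defines no ordering, so "<" raises TypeError.
    try:
        return a < b
    except TypeError:
        return None


print(same_color(EyeColor.BLUE, EyeColor.BLUE))    # True
print(try_ordering(EyeColor.BLUE, EyeColor.GRAY))  # None
```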
    A continuous attribute is one whose values are obtained by measurement. A person’s height,
    for example, is continuous because it can be measured. Continuous data are not limited to a
    set of predetermined values; they may take any value within a range. A person’s weight may be
    expressed as 90 pounds, 90.5 pounds, 90.12 pounds, 90.345 pounds, and so on. Numeric
    representations are always used for continuous data.
    Continuous data allow more flexibility in precision and also make it possible to interpolate
    (Tan, 2018). Proportion is fully meaningful for such data.
    Suppose you are making a pastry. Unless you are a scientist, you do not count the molecules
    of each ingredient in your bottle of milk; you measure the quantity of milk in liters. If your
    oven is too small for the cake recipe you want to make, it is usually fine to cut all of the
    ingredients in half. You will not be able to tell a piece of this scaled-down cake from the
    same piece of the cake you would have gotten by following the original recipe.
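    The recipe example can be sketched in a few lines of Python. The ingredient names and
    amounts are invented for illustration, and scale_recipe and lerp are hypothetical helpers,
    not functions taken from Tan (2018).

```python
def scale_recipe(recipe: dict, factor: float) -> dict:
    """Multiply every continuous quantity by the same factor."""
    return {name: amount * factor for name, amount in recipe.items()}


def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation: the value a fraction t of the way from a to b."""
    return a + (b - a) * t


# Assumed amounts, for illustration only.
recipe = {"milk_l": 0.5, "flour_kg": 0.4, "sugar_kg": 0.2}
half = scale_recipe(recipe, 0.5)

# Halving every amount keeps the proportions between ingredients unchanged.
ratio_before = recipe["milk_l"] / recipe["flour_kg"]
ratio_after = half["milk_l"] / half["flour_kg"]

# A continuous attribute such as weight can take any value between two others.
midpoint = lerp(90.0, 91.0, 0.5)  # 90.5
```

    Neither operation has a counterpart for a categorical value: there is no half of “blue”
    and no value midway between “blue” and “gray”.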
    Barker, K. N. (1980). Data Collection Techniques: Observation.
    Tan, P.-N. (2018). Introduction to Data Mining. Pearson.
  • Discussion 2
    Question 1
    There are three main techniques for handling categorical attributes, namely nominal,
    ordinal, and cardinal (Mougan et al., 2023). When deciding which one to use, first consider
    whether the categories stand on an equal footing. If they do not, consider whether their
    order matters.
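    The response names the techniques without defining them. The sketch below assumes one
    common reading: unordered (nominal) categories are one-hot encoded, and ordered (ordinal)
    categories are mapped to their rank. The helper names and example categories are
    hypothetical and are not taken from Mougan et al. (2023).

```python
def one_hot(value: str, categories: list) -> list:
    """Nominal encoding: one indicator per category, with no implied order."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories]


def ordinal_code(value: str, ordered_levels: list) -> int:
    """Ordinal encoding: the position of the value in an ordered list of levels."""
    return ordered_levels.index(value)  # raises ValueError for an unknown level


colors = ["black", "green", "blue", "gray"]  # order does not matter
sizes = ["small", "medium", "large"]         # order matters

print(one_hot("blue", colors))        # [0, 0, 1, 0]
print(ordinal_code("medium", sizes))  # 1
```

    One-hot vectors keep every pair of categories equally far apart, whereas rank codes
    preserve order but also suggest equal spacing between levels, which may not hold.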
    Question 2
    Continuous attributes are those that have a numeric value, such as height, weight, or age.
    Categorical attributes, on the other hand, are named groups, such as gender or ethnic group.
    Continuous and categorical attributes differ from each other in several ways (Mary et al.,
    2019). In general, continuous variables measure the amount or extent of something, while
    categorical variables classify people according to certain characteristics.
    Question 3
    A concept hierarchy is a way of organizing concepts into levels based on their relatedness
    (Weng & Luo, 2023). It is sometimes called a topic tree, mind map, or taxonomy. The purpose of
    a concept hierarchy is to make it easier for people to find the information they need.
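    A concept hierarchy can be represented as a mapping from each concept to its parent. The
    sketch below is a minimal illustration with an invented city, state, and country hierarchy;
    the PARENT table and both helper functions are hypothetical and do not come from Weng and
    Luo (2023).

```python
# Each concept points to the more general concept one level above it.
PARENT = {
    "Chicago": "Illinois",
    "Springfield": "Illinois",
    "Seattle": "Washington",
    "Illinois": "USA",
    "Washington": "USA",
}


def ancestors(concept: str, parent: dict = PARENT) -> list:
    """Return the path from a concept up to the root of the hierarchy."""
    path = [concept]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def roll_up(concept: str, levels: int, parent: dict = PARENT) -> str:
    """Generalize a concept by climbing the given number of levels (stops at the root)."""
    for _ in range(levels):
        concept = parent.get(concept, concept)
    return concept


print(ancestors("Chicago"))   # ['Chicago', 'Illinois', 'USA']
print(roll_up("Seattle", 1))  # Washington
```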
    Question 4
    Several common patterns can be distinguished. The first is a linear pattern, which indicates
    a direct relationship between two factors, such as the amount of time spent on an assigned
    task and the accuracy of the task (Hewamalage et al., 2022). A non-linear pattern usually
    indicates a threshold value below which something happens reliably, but above which there is
    either no effect at all or an opposite effect. A cyclical pattern appears when two factors
    influence each other in cycles, or when one factor causes the cycles, e.g., temperature and
    seasons (Hewamalage et al., 2022).
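    One way to tell a linear pattern from a cyclical one is to compare how strongly each series
    correlates with time. The sketch below uses synthetic data and a hand-written Pearson
    correlation; the series, the 12-step period, and the function name are assumptions made for
    illustration and are not taken from Hewamalage et al. (2022).

```python
import math


def pearson(xs: list, ys: list) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


t = list(range(48))                                     # 48 time steps
linear = [2 * x + 1 for x in t]                         # steady increase
cyclical = [math.sin(2 * math.pi * x / 12) for x in t]  # repeats every 12 steps

r_linear = pearson(t, linear)      # essentially 1: a perfectly linear pattern
r_cyclical = pearson(t, cyclical)  # weak: the cycle has no overall trend
```

    A weak correlation therefore does not mean that no pattern exists; the cyclical series is
    fully regular, but a linear measure does not capture it.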
    Hewamalage, H., Bergmeir, C., & Bandara, K. (2022). Global models for time series forecasting:
    A simulation study. Pattern Recognition, 124, 108441.
    Mary, J., Calauzenes, C., & El Karoui, N. (2019, May). Fairness-aware learning for continuous
    attributes and treatments. In International Conference on Machine Learning (pp. 4382-4391). PMLR.
    Mougan, C., Álvarez, J. M., Ruggieri, S., & Staab, S. (2023, August). Fairness implications of
    encoding protected categorical attributes. In Proceedings of the 2023 AAAI/ACM
    Conference on AI, Ethics, and Society (pp. 454-465).
    Weng, W., & Luo, W. (2023). A Comparative Analysis of Data Mining Methods and
    Hierarchical Linear Modeling Using PISA 2018 Data. International Journal of Database
    Management Systems (IJDMS), 15.

