November 18, 2024
Part 5.3: L1 and L2 Regularization to Decrease Overfitting
L1 and L2 regularization are added to the network
as a parameter of the layer passed to model.add():
model.add(Dense(25, activation='relu',activity_regularizer=regularizers.l1(1e-4)))
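To see what the regularizer actually contributes to the loss, here is a numpy-only sketch (not Heaton's code; function names are illustrative): an L1 regularizer with rate 1e-4 adds rate times the sum of absolute values, an L2 regularizer adds rate times the sum of squares.

```python
import numpy as np

# Illustrative sketch of the penalty terms an L1/L2 regularizer
# adds to the loss (function names are made up for this example).
def l1_penalty(values, rate=1e-4):
    # L1: rate * sum of absolute values -> pushes values toward exactly 0
    return rate * np.sum(np.abs(values))

def l2_penalty(values, rate=1e-4):
    # L2: rate * sum of squared values -> penalizes large values
    return rate * np.sum(np.square(values))

acts = np.array([0.5, -1.0, 2.0])
print(l1_penalty(acts))  # 1e-4 * 3.5
print(l2_penalty(acts))  # 1e-4 * 5.25
```

With activity_regularizer (as above) the penalty is computed on the layer's activations; with kernel_regularizer it would be computed on the weights instead.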
Regression vs Classification
Regression is used when the prediction applies to a numerical value.
In the examples, J. Heaton uses the data from jh-simple-dataset.csv
and the prediction is made for the age column:
y = df['age'].values
Classification is used when the prediction applies to a set of discrete values.
In the examples, J. Heaton uses the data from jh-simple-dataset.csv
and the prediction is made for the product column:
dummies = pd.get_dummies(df['product']) # Classification
products = dummies.columns
y = dummies.values
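A small self-contained example of what get_dummies produces (the frame below is made up and only stands in for jh-simple-dataset.csv):

```python
import pandas as pd

# Made-up stand-in for the 'product' column of jh-simple-dataset.csv
df = pd.DataFrame({'product': ['a', 'b', 'c', 'a']})

dummies = pd.get_dummies(df['product'])  # one column per category
products = dummies.columns               # the category names: 'a', 'b', 'c'
y = dummies.values                       # one-hot rows, e.g. [1, 0, 0] for 'a'

print(list(products))
print(y.astype(int))  # astype(int): newer pandas returns bool dummies
```

So each row of y is a one-hot vector, which is the target format expected for a softmax classification output.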
From the scikit-learn web page:
https://scikit-learn.org/stable/
Classification: Identifying which category an object belongs to.
Regression: Predicting a continuous-valued attribute associated with an object.
Part 5.4: Drop Out for Keras to Decrease Overfitting
The Dropout layer is added to the network
as a layer of its own via model.add():
model.add(Dropout(0.5))
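Conceptually, during training Dropout(0.5) zeroes each unit with probability 0.5 and rescales the survivors so the expected activation is unchanged ("inverted dropout"). A numpy sketch of that idea (illustrative only; Keras' internals differ):

```python
import numpy as np

# Illustrative inverted-dropout sketch, not Keras' actual implementation.
def dropout(x, rate=0.5, seed=0):
    rng = np.random.default_rng(seed)
    keep = 1.0 - rate
    mask = rng.random(x.shape) < keep  # keep each unit with probability 1 - rate
    return x * mask / keep             # rescale so the expected value is unchanged

x = np.ones(10)
out = dropout(x)
# Each entry is either 0.0 (dropped) or 2.0 (kept and rescaled by 1/0.5)
```

At inference time dropout is disabled and the layer passes its input through unchanged.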
November 19, 2024
Original code causes error
AttributeError: 'DataFrame' object has no attribute 'append'. Did you mean: '_append'?
in line
oos_y.append(y_test)
The problem: I hadn't indented the code correctly. The block after
# Build the oos prediction list and calculate the error.
was indented and was therefore executed as part of the for loop.
After de-indenting the block (moving it back to the start of the line), it worked.
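A minimal sketch of the indentation issue (the names oos_y and oos_pred follow the course code; the folds and the "model" below are toy stand-ins): appending per-fold results belongs inside the loop, concatenating them belongs after it.

```python
import numpy as np

# Toy data; a real example would use train/test folds and a fitted model.
X = np.arange(10, dtype=float)
y = X * 2.0

oos_y, oos_pred = [], []
for fold in range(5):
    test_idx = np.arange(fold * 2, fold * 2 + 2)  # toy 5-fold split
    y_test = y[test_idx]
    pred = X[test_idx] * 2.0   # stand-in "prediction" (perfect on purpose)
    oos_y.append(y_test)       # indented: runs once per fold
    oos_pred.append(pred)

# Build the oos prediction list and calculate the error.
# NOT indented: runs exactly once, after the loop has finished.
oos_y = np.concatenate(oos_y)
oos_pred = np.concatenate(oos_pred)
score = np.sqrt(np.mean((oos_pred - oos_y) ** 2))  # RMSE
```

If the last block is indented into the loop, np.concatenate runs on every fold and then reassigns oos_y from a list to an array, which breaks the next append.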
November 21, 2024
In the ShuffleSplit example of scikit-learn
https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.ShuffleSplit.html
The input data is
X = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [3, 4], [5, 6]])
and the target data is
y = np.array([1, 2, 1, 2, 1, 2])
What does this mean?
Does it mean that
- an input of [1,2] is to result in an output of 1
- an input of [3,4] is to result in an output of 2
- an input of [5,6] is to result in an output of 1
etc.?
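Yes: in scikit-learn, row i of X is the sample whose target is y[i], so [1,2] pairs with 1, [3,4] with 2, and so on. ShuffleSplit never looks at the values at all; it only shuffles the row indices 0..n-1 and cuts them into train/test index arrays, so the pairs stay aligned. A numpy-only sketch of one such split (the shuffling code is illustrative, not scikit-learn's):

```python
import numpy as np

X = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [3, 4], [5, 6]])
y = np.array([1, 2, 1, 2, 1, 2])

# What ShuffleSplit does per split, sketched by hand:
rng = np.random.default_rng(0)
idx = rng.permutation(len(X))              # shuffled row indices 0..5
n_test = 2                                 # e.g. test_size=2
test_idx, train_idx = idx[:n_test], idx[n_test:]

# Indexing X and y with the SAME index arrays keeps the pairs aligned.
X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]
```

Because both arrays are indexed with the same index arrays, every X row still travels with its own target after the split.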