What is the primary distinction between supervised learning and unsupervised learning in machine learning as discussed in the lecture?
A. Supervised learning requires both input and output data (labeled data), whereas unsupervised learning only requires input data without labeled outputs.
B. Supervised learning uses unlabeled data to discover hidden patterns, while unsupervised learning uses labeled data to train the model.
C. Both supervised and unsupervised learning require labeled data, but supervised learning focuses on predicting the output labels.
D. Unsupervised learning is primarily used for reinforcement tasks, while supervised learning is used for tasks that require interaction with the environment.
Which Python library, mentioned in the lecture, is considered fundamental for scientific computing due to its optimized C code and ability to handle high-performance N-dimensional array objects?
A. Matplotlib
B. NumPy
C. SciPy
D. Pandas
According to the lecture, which learning paradigm involves learning through interaction with the environment, relying on trial, error, and reward feedback?
A. Reinforcement learning
B. Unsupervised learning
C. Supervised learning
D. Semi-supervised learning
In the context of the lecture, which representation method is typically used for images when applying machine learning models, as seen in tasks like image recognition?
A. One-hot Encoding
B. Continuous Numeric Encoding
C. Feature Vector model
D. Bag of Words model
In class, we briefly discussed applying machine learning models to text inputs. Before training or running inference with the model, it is common practice to transform raw text into a format that algorithms can analyze and process more easily, for example by breaking a sentence into words and converting them into numerical representations (“tokens”).
In general, what is the name of the preprocessing step that converts a document into a list of tokens?
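As a rough illustration of the step described above, the sketch below splits a sentence into words and maps each distinct word to an integer id. The function names `split_into_words` and `build_vocabulary` and the sample sentence are hypothetical, and splitting on whitespace is only one simple choice, not necessarily the method used in the lecture.

```python
def split_into_words(sentence):
    """Lowercase a sentence and split it on whitespace."""
    return sentence.lower().split()


def build_vocabulary(words):
    """Assign each distinct word an integer id in order of first appearance."""
    vocab = {}
    for word in words:
        if word not in vocab:
            vocab[word] = len(vocab)
    return vocab


# Hypothetical example sentence, chosen only for illustration.
sentence = "the cat sat on the mat"
words = split_into_words(sentence)
vocab = build_vocabulary(words)
ids = [vocab[w] for w in words]

print(words)  # ['the', 'cat', 'sat', 'on', 'the', 'mat']
print(ids)    # [0, 1, 2, 3, 0, 4]
```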
In class, we discussed the representation method for converting an image input into a feature vector, through a technique called “flattening”. For a greyscale image, say an 8-bit image of size 100*100 (height * width), the feature vector is constructed by vertically stacking the pixels of each row, as shown below
where the 2-D matrix stores the pixel value (in range 0~255 for an 8-bit image) at each spatial location of the image.
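The greyscale case described above can be sketched in NumPy as follows. This is a minimal illustration that assumes a tiny 3*4 image in place of the 100*100 one; the array contents and the variable names `image` and `x` are made up for the example.

```python
import numpy as np

# Hypothetical 3x4 greyscale image with 8-bit pixel values (range 0..255).
image = np.arange(12, dtype=np.uint8).reshape(3, 4)

# Row-major flattening: the pixels of row 1 come first, then row 2, and so on.
# reshape(-1, 1) yields a column vector of shape (height * width, 1).
x = image.reshape(-1, 1)

print(image.shape)  # (3, 4)
print(x.shape)      # (12, 1)
```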
Now, suppose we need to convert a 100*100 RGB image into its feature vector. The RGB image is given by three 2-D matrices $(R_{100 \times 100}, G_{100 \times 100}, B_{100 \times 100})$, where each is a 100*100 matrix corresponding to a color channel. After flattening, the feature vector should be
\[\begin{array}{c} x = \begin{bmatrix} \text{Row 1 of } R\\ \text{Row 1 of } G \\ \text{Row 1 of } B \\ \vdots\\ \text{Row 100 of } R\\ \text{Row 100 of } G \\ \text{Row 100 of } B \\ \end{bmatrix} \end{array}\]
True or False?
Using the data-representation notation discussed in class, what is the notation for feature 11 of sample 12?
A. $x_{12}^{(11)}$
B. $x_{11}^{(12)}$
C. $x_{11_{(12)}}$
D. $x_{11}^{[12]}$
Using the notations from class, we define
\[\begin{array}{c} x^{(j)} = \begin{bmatrix} x^{(j)}_1\\ x^{(j)}_2\\ \vdots\\ x^{(j)}_m \end{bmatrix} \end{array}\]
to be the feature vector of the $m$ features of the $j$-th sample.
Let $X$ denote the input matrix of $n$ samples $\{x^{(i)}\}_{i=1}^n$, each measured on $m$ features. In class, we discussed two different ways to represent $X$. Although both are valid, which one is more common and better aligned with NumPy (as well as many other machine learning libraries), and what is X.shape?
A. $X = \begin{bmatrix} x^{(1)} \ x^{(2)} \ \ldots \ x^{(n)} \end{bmatrix}$, X.shape = (m, n). So we represent feature matrices with samples as rows and features as columns.
B. $\begin{array}{c} X = \begin{bmatrix} {x^{(1)}}\\ {x^{(2)}}\\ \vdots\\ {x^{(n)}} \end{bmatrix} \end{array}$, X.shape = (nm, 1). So we represent feature matrices by stacking the n feature vectors.
C. $\begin{array}{c} X = \begin{bmatrix} {x^{(1)}}^T\\ {x^{(2)}}^T\\ \vdots\\ {x^{(n)}}^T \end{bmatrix} \end{array}$, X.shape = (n, m). So we represent feature matrices with samples as rows and features as columns.
D. $X = \begin{bmatrix} {x^{(1)}} & {x^{(2)}} & \ldots & {x^{(n)}} \end{bmatrix}$, X.shape = (m, n). So we represent feature matrices with samples as columns and features as rows.
When fitting our model with the training data, as we increase the number of parameters in the model, the model will be able to learn and represent more complex relationships between the features and output, and therefore better fit the data. We say the model has more ________.
[Select all that are correct] Using NumPy (imported as “np”), which of the following create a 3-by-3 identity matrix $M$, i.e.
\[\begin{array}{c} M = \begin{bmatrix} 1&0&0\\ 0&1&0\\ 0&0&1 \end{bmatrix} \end{array}\]
A.
B. M = np.diag([1, 1, 1])
C. M = np.ones([3, 3])
D. M = np.eye(3)