3 Months Free Update
When training a deep neural network model, a loss function measures the difference between the model's predictions and the actual labels.
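As a reference for this statement, here is a minimal sketch of one common loss function, mean squared error, written in plain Python with no framework assumed; the function name `mse_loss` is illustrative, not from any particular library:

```python
def mse_loss(predictions, labels):
    """Average squared difference between model predictions and true labels.

    A smaller value means the predictions are closer to the labels,
    which is exactly what a loss function is meant to measure.
    """
    assert len(predictions) == len(labels)
    return sum((p - y) ** 2 for p, y in zip(predictions, labels)) / len(predictions)

# Predictions close to the labels give a small loss.
loss = mse_loss([0.9, 0.2, 0.1], [1.0, 0.0, 0.0])
```

During training, an optimizer adjusts the model's weights to reduce this value; other tasks use other losses (e.g. cross-entropy for classification), but all play the same role of quantifying prediction error.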
If a scanned document is not properly placed, and the text is tilted, it is difficult to recognize the characters in the document. Which of the following techniques can be used for correction in this case?
Maximum likelihood estimation (MLE) requires knowledge of the sample data's distribution type.
In 2017, the Google machine translation team proposed the Transformer in their paper "Attention Is All You Need". The Transformer consists of an encoder and a(n) --------. (Fill in the blank.)
Maximum likelihood estimation (MLE) can be used for parameter estimation in a Gaussian mixture model (GMM).
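To make the two MLE statements above concrete, here is a hedged sketch of the closed-form maximum likelihood estimates for a single Gaussian, in plain Python; the function name `gaussian_mle` is illustrative. Note how the Gaussian distribution family must be assumed up front, and the data then determine only the parameters:

```python
def gaussian_mle(samples):
    """Closed-form MLE for a single Gaussian: sample mean and (biased) variance.

    MLE presupposes the distribution type (here, Gaussian); the samples
    are used only to estimate its parameters mu and sigma^2.
    """
    n = len(samples)
    mu = sum(samples) / n                           # MLE of the mean
    var = sum((x - mu) ** 2 for x in samples) / n   # MLE of the variance
    return mu, var

mu, var = gaussian_mle([2.0, 4.0, 6.0])
```

For a Gaussian mixture model no such closed form exists, because the component assignments are hidden; in practice the EM algorithm maximizes the likelihood iteratively, alternating per-component estimates like the one above weighted by each sample's responsibilities.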
The natural language processing field usually represents words with distributed semantic representations. Instead of a sparse one-hot (0-1) vector, in which all words are mutually orthogonal, each word becomes a point in a multi-dimensional real-number space, concretely represented as a real-valued vector.
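A toy illustration of this idea, with hand-made three-dimensional vectors standing in for trained embeddings (the vectors and words are made up for demonstration, not from any real model), compared by cosine similarity:

```python
import math

# Toy "embeddings": each word is a point in a real-number space.
embeddings = {
    "king":  [0.8, 0.6, 0.1],
    "queen": [0.7, 0.7, 0.1],
    "apple": [0.1, 0.0, 0.9],
}

def cosine(u, v):
    """Cosine similarity between two real-valued vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Unlike one-hot vectors (similarity always 0 between different words),
# distributed vectors let related words lie closer than unrelated ones.
related = cosine(embeddings["king"], embeddings["queen"])
unrelated = cosine(embeddings["king"], embeddings["apple"])
```

With one-hot vectors, `cosine` between any two distinct words would always be exactly 0; the real-valued representation is what makes graded similarity possible.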
Transformer models outperform LSTM when analyzing and processing long-distance dependencies, making them more effective for sequence data processing.
Which of the following are the impacts of the development of large models?
Seq2Seq is a model that translates one sequence into another, essentially consisting of two recurrent neural networks (RNNs): one is the Encoder, and the other is the ---------. (Fill in the blank.)
In an image preprocessing experiment, the cv2.imread("lena.png", 1) function provided by OpenCV is used to read images. The parameter "1" in this function represents a ---------channel image. (Fill in the blank with a number.)
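For context: in OpenCV, the flag value 1 is cv2.IMREAD_COLOR, which loads the image as a color (BGR) array of shape height x width x channels. A minimal sketch, using a NumPy array to stand in for the decoded image since no actual image file is assumed here:

```python
import numpy as np

# Stand-in for the array cv2.imread("lena.png", 1) would return:
# a color image is stored as an H x W x C uint8 array.
img = np.zeros((4, 4, 3), dtype=np.uint8)  # 4x4 image, color

height, width, channels = img.shape
# For a color image read with flag 1 (cv2.IMREAD_COLOR), channels is 3.
```

By contrast, flag 0 (cv2.IMREAD_GRAYSCALE) yields a single-channel H x W array.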
Which of the following statements about the multi-head attention mechanism of the Transformer are true?
Which of the following statements are true about the differences between using convolutional neural networks (CNNs) in text tasks and image tasks?