You want to write a Python script to create a collection of different projects for your data science team. Which Oracle Cloud Infrastructure (OCI) Data Science Interface would you use?
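For illustration, a minimal sketch of creating several projects programmatically with the OCI Python SDK; the compartment OCID and project names are placeholders, not values from the question.

```python
# Hypothetical sketch: creating multiple Data Science projects with the OCI
# Python SDK. Compartment OCID and project names are placeholders.
import oci

config = oci.config.from_file()  # or a resource principal signer inside OCI
ds_client = oci.data_science.DataScienceClient(config)

for name in ["churn-analysis", "demand-forecast", "fraud-detection"]:
    details = oci.data_science.models.CreateProjectDetails(
        compartment_id="ocid1.compartment.oc1..example",
        display_name=name,
        description=f"Team project: {name}",
    )
    project = ds_client.create_project(details)
    print(project.data.id)
```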
You have built a machine learning model to predict whether a bank customer is going to default on a
loan. You want to use Local Interpretable Model-Agnostic Explanations (LIME) to understand a
specific prediction. What is the key idea behind LIME?
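For context, a small sketch of LIME's core idea, fitting a local, interpretable surrogate model around a single prediction, using the open-source `lime` package and synthetic data; the class names and features are illustrative, not from the question.

```python
# Illustrative sketch of LIME: perturb one input locally and fit a simple,
# interpretable surrogate model around that single prediction.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

X, y = make_classification(n_samples=500, n_features=6, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

explainer = LimeTabularExplainer(
    training_data=X,
    feature_names=[f"f{i}" for i in range(6)],
    class_names=["no_default", "default"],
    mode="classification",
)

# Explain a single prediction with a local, interpretable (linear) surrogate.
exp = explainer.explain_instance(X[0], model.predict_proba, num_features=5)
print(exp.as_list())  # per-feature weights of the local surrogate model
```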
You want to write a program that performs document analysis tasks such as extracting text and
tables from a document. Which Oracle AI service would you use?
You are preparing a configuration object necessary to create a Data Flow application. Which THREE parameter values should you provide?
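For context, a hedged sketch of building a Data Flow application configuration through the ADS jobs API; the method names follow the ADS documentation, and the OCIDs, shapes, and bucket URIs are placeholders to verify against your tenancy.

```python
# Hedged sketch: configuring a Data Flow application via ADS. All OCIDs,
# shapes, bucket URIs, and the script path below are placeholders.
from ads.jobs import DataFlow, DataFlowRuntime, Job

infrastructure = (
    DataFlow()
    .with_compartment_id("ocid1.compartment.oc1..example")
    .with_driver_shape("VM.Standard.E4.Flex")
    .with_executor_shape("VM.Standard.E4.Flex")
    .with_num_executors(2)
    .with_logs_bucket_uri("oci://dataflow-logs@namespace/")
)

runtime = (
    DataFlowRuntime()
    .with_script_uri("oci://dataflow-apps@namespace/etl_script.py")
    .with_script_bucket("oci://dataflow-apps@namespace/")
)

df_job = Job(name="example-dataflow-app", infrastructure=infrastructure, runtime=runtime)
df_app = df_job.create()  # creates the Data Flow application
```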
The Oracle AutoML pipeline automates hyperparameter tuning by training the model with different parameters in parallel. You have created an instance of Oracle AutoML as oracle_automl and now you want an output with all the different trials performed by Oracle AutoML. Which of the following commands gives you the results of all the trials?
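For context, a hedged sketch of the surrounding ADS AutoML workflow; the trials-report call at the end is the one this question targets, and the exact method name should be verified against the ADS version in use.

```python
# Hedged sketch: training with the ADS Oracle AutoML provider and then
# printing the report of every trial. `train_data` is an assumed
# ADSDatasetWithTarget; method names follow the ADS documentation.
from ads.automl.driver import AutoML
from ads.automl.provider import OracleAutoMLProvider

oracle_automl = AutoML(train_data, provider=OracleAutoMLProvider())
model, baseline = oracle_automl.train()

# Report of all trials (algorithms, hyperparameters, scores) run by AutoML.
oracle_automl.print_trials()
```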
You realize that your model deployment is about to reach its utilization limit. What would you do
to avoid the issue before requests start to fail?
What preparation steps are required to access an Oracle AI service SDK from a Data Science
notebook session?
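For context, a minimal sketch of calling an OCI AI service (AI Language, as one example) from a notebook session with resource principal authentication; it assumes the notebook's dynamic group has a policy granting use of that service.

```python
# Hedged sketch: resource principal auth from a notebook session to an OCI AI
# service. Assumes the required dynamic-group policy is in place; the specific
# operation shown is only an example (newer SDK versions favor batch calls).
import oci

signer = oci.auth.signers.get_resource_principals_signer()
language_client = oci.ai_language.AIServiceLanguageClient(config={}, signer=signer)

details = oci.ai_language.models.DetectDominantLanguageDetails(
    text="Quarterly revenue exceeded expectations."
)
print(language_client.detect_dominant_language(details).data)
```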
As you are working in your notebook session, you find that your notebook session does not have
enough compute CPU and memory for your workload.
How would you scale up your notebook session without losing your work?
As a data scientist, you have stored sensitive data in a database. You need to protect this data by
using a master encryption algorithm, which uses symmetric keys. Which master encryption
algorithm would you choose in the Oracle Cloud Infrastructure (OCI) Vault service?
You are a data scientist building a pipeline in the Oracle Cloud Infrastructure (OCI) Data Science
service for your machine learning project. You want to optimize the pipeline completion time by
running some steps in parallel. Which statement is true about running pipeline steps in parallel?
You want to evaluate the relationship between feature values and target variables. You have a
large number of observations with a near-uniform distribution, and the features are highly
correlated.
Which model explanation technique should you choose?
You want to ensure that all stdout and stderr from your code are automatically collected and
logged, without implementing additional logging in your code. How would you achieve this with Data
Science Jobs?
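For context, a hedged sketch of a Data Science Job defined with ADS and attached to an OCI Logging log group and log, so stdout and stderr from the script are captured without extra logging code; the OCIDs, shape, and conda slug are placeholders.

```python
# Hedged sketch: job definition with logging attached. All OCIDs, the shape,
# and the conda environment slug are placeholders.
from ads.jobs import Job, DataScienceJob, ScriptRuntime

job = (
    Job(name="train-job")
    .with_infrastructure(
        DataScienceJob()
        .with_log_group_id("ocid1.loggroup.oc1..example")
        .with_log_id("ocid1.log.oc1..example")
        .with_shape_name("VM.Standard.E4.Flex")
    )
    .with_runtime(
        ScriptRuntime()
        .with_source("train.py")
        .with_service_conda("generalml_p38_cpu_v1")  # example slug
    )
)

job.create()
run = job.run()
run.watch()  # streams the job run's captured stdout/stderr from OCI Logging
```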
You are creating an Oracle Cloud Infrastructure (OCI) Data Science job that will run on a recurring basis in a production environment. This job will pick up sensitive data from an Object Storage bucket, train a model, and save it to the model catalog. How would you design the authentication mechanism for the job?
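For context, the kind of call a job designed around resource principal authentication would make, assuming the job run's dynamic group has policies for the bucket and the model catalog.

```python
# Hedged sketch: keyless authentication inside a scheduled job run. Assumes
# dynamic-group policies for Object Storage and the model catalog are in place.
import ads

ads.set_auth(auth="resource_principal")
# Subsequent ADS/OCI calls in the job (reading the bucket, saving the model to
# the catalog) are signed with the job run's resource principal.
```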
You are a data scientist trying to load data into your notebook session. You understand that
Accelerated Data Science (ADS) SDK supports loading various data formats.
Which of the following THREE are ADS supported data formats?
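For context, a small sketch of reading a few common tabular formats in a notebook session with pandas, which ADS builds on; the file names are placeholders.

```python
# Illustrative only; file paths are placeholders.
import pandas as pd

df_csv = pd.read_csv("transactions.csv")               # delimited text
df_json = pd.read_json("transactions.json")            # JSON records
df_parquet = pd.read_parquet("transactions.parquet")   # Parquet (needs pyarrow)
```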
You are a data scientist working for a manufacturing company. You have developed a forecasting
model to predict the sales demand in the upcoming months. You created a model artifact that
contained custom logic requiring third-party libraries. When you deployed the model, it failed to run
because you did not include all the third-party dependencies in the model artifact. What file should
be modified to include the missing libraries?
You are working as a data scientist for a healthcare company. They decide to analyze the data to
find patterns in a large volume of electronic medical records. You are asked to build a PySpark
solution to analyze these records in a JupyterLab notebook. What is the order of recommended
steps to develop a PySpark application in Oracle Cloud Infrastructure (OCI) Data Science?
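For context, a minimal local PySpark snippet of the kind typically developed and tested in a notebook session (for example, after installing a PySpark conda environment) before being scaled out to Data Flow; the file and column names are placeholders.

```python
# Illustrative local PySpark development in a notebook; paths/columns assumed.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("medical-records-eda").getOrCreate()

records = spark.read.json("sample_records.json")     # small local sample
records.printSchema()
records.groupBy("diagnosis_code").count().show()     # assumed column name

spark.stop()
```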
You are building a model and need input that represents data as morning, afternoon, or evening.
However, the data contains a time stamp. What part of the Data Science life cycle would you be in
when creating the new variable?
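For context, a small sketch of the transformation the question describes: deriving a morning/afternoon/evening feature from a timestamp column (column names are placeholders).

```python
# Illustrative feature engineering: timestamp -> time-of-day category.
import pandas as pd

df = pd.DataFrame({
    "event_time": pd.to_datetime(
        ["2024-01-01 07:30", "2024-01-01 14:10", "2024-01-01 20:45"]
    )
})

def time_of_day(hour: int) -> str:
    if hour < 12:
        return "morning"
    if hour < 18:
        return "afternoon"
    return "evening"

df["time_of_day"] = df["event_time"].dt.hour.map(time_of_day)
print(df)
```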
You have just received a new data set from a colleague. You want to quickly find out summary
information about the data set, such as the types of features, the total number of observations, and
distributions of the data. Which Accelerated Data Science (ADS) SDK method from the ADSDataset
class would you use?
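For context, a hedged sketch of producing that kind of one-call summary with ADS; the loading entry point varies by ADS version, and the file path is a placeholder.

```python
# Hedged sketch: one-call dataset summary in a notebook. The file path is a
# placeholder, and DatasetFactory is deprecated in newer ADS releases.
from ads.dataset.factory import DatasetFactory

ds = DatasetFactory.open("colleague_data.csv")
ds.show_in_notebook()  # feature types, observation counts, distributions
```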
Which Oracle Accelerated Data Science (ADS) classes can be used for easy access to data sets from
reference libraries and index websites such as scikit-learn?
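For context, a hedged sketch of browsing reference datasets with ADS; the import path and dataset name should be checked against the ADS version in use.

```python
# Hedged sketch: listing and opening a scikit-learn reference dataset via ADS.
from ads.dataset.dataset_browser import DatasetBrowser

sklearn_browser = DatasetBrowser.sklearn()
print(sklearn_browser.list())        # names of available scikit-learn datasets
wine_ds = sklearn_browser.open("wine")
wine_ds.show_in_notebook()
```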