30 Cards
ChatGPT
Find the best introduction to ChatGPT with simple flashcards that highlight the important terms related to the working of the revolutionary AI tool.
Learn ChatGPT concepts
Adversarial Training
Adversarial training is a technique for improving the robustness of language models by training them on examples crafted specifically to challenge the model. This forces the model to learn more robust representations. It is a defense drawn from adversarial machine learning, the broader field that studies attacks designed to make a machine learning system malfunction. Such attacks can be white box, where the attacker knows the model's internals, or black box, where the attacker sees only the model's inputs and outputs.
Attention Mechanism
Attention mechanisms are an important component of deep learning models. In many problems the input data is massive and complex, and the model has to pass it through multiple neural network layers in which every node processes data. Across that many layers, it becomes difficult to keep track of which parts of the input are relevant. Attention mechanisms address this difficulty by letting the model weight the most relevant parts of the input more heavily than the rest.
Beam Search
Beam search is a popular search algorithm used in NLP to find the most probable sequence of tokens generated by a language model. It does so by keeping a beam, a fixed number of the most likely partial sequences, at every time step. Beam search is a heuristic search algorithm that explores a graph by expanding only the most promising nodes in a limited set. It builds its search tree breadth-first.
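The beam-keeping step can be sketched in a few lines. This is a minimal illustration, not a production decoder: `TOY_MODEL` is a hypothetical lookup table standing in for a real language model, and the names `beam_search`, `beam_width`, and `max_steps` are assumptions made for this example.

```python
import math

# Hypothetical toy model: next-token probabilities given the previous token.
TOY_MODEL = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.4, "</s>": 0.1},
    "a": {"cat": 0.3, "dog": 0.6, "</s>": 0.1},
    "cat": {"</s>": 1.0},
    "dog": {"</s>": 1.0},
}


def beam_search(model, beam_width=2, max_steps=5):
    """Return finished (log_prob, tokens) pairs, most probable first."""
    beams = [(0.0, ["<s>"])]  # each beam is (log-probability, tokens so far)
    finished = []
    for _ in range(max_steps):
        # Expand every sequence in the beam by every possible next token.
        candidates = []
        for score, tokens in beams:
            for token, prob in model[tokens[-1]].items():
                candidates.append((score + math.log(prob), tokens + [token]))
        # Keep only the beam_width most probable candidates.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, tokens in candidates[:beam_width]:
            if tokens[-1] == "</s>":
                finished.append((score, tokens))
            else:
                beams.append((score, tokens))
        if not beams:
            break
    finished.sort(key=lambda c: c[0], reverse=True)
    return finished


best_score, best_tokens = beam_search(TOY_MODEL)[0]
print(best_tokens)
```

With a beam width of 2, the toy model above ends on the sequence "the cat" with probability 0.3. A wider beam explores more candidates per step at a higher cost.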
Coreference Resolution
Coreference resolution is the task of finding all the expressions in a text that refer to the same entity. It is an important step in high-level, complex NLP tasks that require natural language understanding. The NLP tasks that need coreference resolution include information extraction, document summarization, and question answering. You can also understand coreference resolution as the task of clustering the mentions in a text that refer to the same real-world entity.
Evaluation Metrics
Evaluation metrics are the measures used to quantify the performance of language models. Some of the notable evaluation metrics for language models include perplexity, BLEU score, accuracy, and F1 score. For example, the Bilingual Evaluation Understudy or BLEU score is an important metric for evaluating the quality of machine translation systems. For classification models, you can also use metrics such as the confusion matrix, classification accuracy, and logarithmic loss.
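Two of the metrics named above are short enough to compute by hand. The sketch below assumes binary labels for the F1 score and assumes that the model has already assigned a probability to each token for perplexity. The function names are chosen for this example only.

```python
import math


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def perplexity(token_probs):
    """Exponential of the average negative log-probability per token."""
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
print(perplexity([0.25, 0.25, 0.25, 0.25]))
```

A lower perplexity means the model found the text less surprising. A model that assigns probability 0.25 to every token has a perplexity of 4, as if it were choosing among four equally likely options.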
Dependency Parsing
Dependency parsing is the process of examining the dependencies between the words in a sentence in order to determine its grammatical structure. The process breaks a sentence down into its components. It assumes that there is a direct link between the linguistic units of a sentence. The result identifies the relationship between all the linguistic units and represents each relation as a directed arc in a typed dependency structure.
Deep Learning
Deep learning is an important subset of machine learning built on neural networks with three or more layers. These neural networks attempt to simulate the behavior of the human brain in order to learn from large volumes of data. A neural network with a single layer can still make approximate predictions, and additional hidden layers help refine accuracy. Deep learning drives many AI applications and services that improve automation and perform tasks without human intervention.
Byte Pair Encoding
Byte Pair Encoding, or BPE, is an important concept in NLP tokenization, which divides words into subword units according to their frequency in the training data. BPE was originally created as an algorithm for compressing text. OpenAI used BPE for tokenization when pre-training its GPT models, and other transformer models, such as BART, also use BPE tokenization. BPE training starts by computing the unique set of words in the training data and their frequencies, and then builds a vocabulary by repeatedly merging the most frequent pair of adjacent symbols.
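The merge loop at the heart of BPE training can be sketched as follows. This is a simplified character-level version under stated assumptions: the word counts in `corpus` are made up for illustration, ties between equally frequent pairs go to the pair seen first, and real tokenizers add details such as end-of-word markers or byte-level input.

```python
from collections import Counter


def count_pairs(word_freqs):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in word_freqs.items():
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return pairs


def merge_pair(pair, word_freqs):
    """Replace every occurrence of `pair` with one merged symbol."""
    merged = {}
    for symbols, freq in word_freqs.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(corpus_counts, num_merges):
    """Learn `num_merges` merge rules from a word -> count mapping."""
    word_freqs = {tuple(word): freq for word, freq in corpus_counts.items()}
    merges = []
    for _ in range(num_merges):
        pairs = count_pairs(word_freqs)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges.append(best)
        word_freqs = merge_pair(best, word_freqs)
    return merges, word_freqs


# Hypothetical word counts, for illustration only.
corpus = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
merges, segmented = learn_bpe(corpus, num_merges=3)
print(merges)
print(segmented)
```

After three merges on this toy corpus, the frequent suffix "est" has become a single subword unit, which is how BPE ends up sharing pieces across related words.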
Prompt Engineering
Prompt engineering is an AI engineering technique for refining the outputs of Large Language Models or LLMs. It also refers to the process of crafting specific prompts to refine the outputs of generative AI services. As generative AI tools continue to improve, prompt engineering is being used to generate different types of content, such as scripts, robotic process automation bots, robot instructions, and 3D assets. It is a powerful technique for steering LLMs toward desired use cases without retraining them.
Natural Language Processing
Natural Language Processing, or NLP, is the branch of computer science and AI that focuses on giving computers the ability to understand text and spoken words the way humans do. NLP combines computational linguistics with ML and deep learning models. Computational linguistics involves rule-based modeling of human language, and together these technologies help in processing human language in the form of text or voice data. NLP powers computer programs that translate text into different languages and respond to spoken commands.
Conversational AI
Conversational AI refers to a type of artificial intelligence or AI capable of simulating human conversation. The primary principle behind conversational AI is Natural Language Processing or NLP, which helps computers understand and process human language. NLP analyzes the meaning of speech and text and then generates relevant responses for the conversation. Conversational AI systems learn to do this from massive volumes of training data.
GPT
Generative Pre-trained Transformer or GPT models are AI models that are pre-trained on large amounts of data and then generate new content from what they have learned. The term ‘Generative’ in GPT refers to the ability to create new text. ‘Pre-trained’ means that the model is first trained on a massive pool of data before it is applied to specific tasks. Finally, ‘Transformer’ names the neural network architecture that GPT models use, which defines how the different components of the model are connected.
Fine-tuning
Fine-tuning is an important concept in machine learning and involves reusing the weights of a network that has already been trained. The weights serve as the starting values for training a new network. Current best practice is to start from a model pre-trained on a large dataset for a problem similar to the one you want to solve. Fine-tuning is especially helpful when you don’t have a significant amount of data for a specific task.
Generative Adversarial Networks
Generative Adversarial Networks, or GANs, are AI models tailored for generative modeling through deep learning methods like convolutional neural networks. It is important to note that generative modeling falls in the unsupervised learning category. Generative modeling focuses on automatically discovering and learning the patterns in the input data in order to generate new outputs that could plausibly have come from the original dataset. GANs train a generative model by framing the problem as a supervised learning problem with two sub-models: a generator that produces new examples and a discriminator that tries to tell them apart from real ones.
Language Modeling
Language modeling refers to the use of probabilistic and statistical techniques to determine the probability of a specific sequence of words occurring in a sentence. Language models analyze massive volumes of text data to build the basis for their word predictions. They help in NLP tasks such as question answering and machine translation. Through language modeling, a model learns the important features and characteristics of a language.
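One of the simplest statistical language models is a bigram model, which estimates the probability of each word from the word before it. The sketch below is a toy illustration with no smoothing. The three-sentence `corpus` and the function names are assumptions made for this example.

```python
from collections import Counter


def train_bigram(sentences):
    """Count contexts and bigrams, with sentence boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens[:-1])  # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def sequence_probability(sentence, unigrams, bigrams):
    """Multiply P(word | previous word) over the whole sentence."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, cur in zip(tokens, tokens[1:]):
        if unigrams[prev] == 0:
            return 0.0  # unseen context, no smoothing in this sketch
        prob *= bigrams[(prev, cur)] / unigrams[prev]
    return prob


# Hypothetical training corpus, for illustration only.
corpus = ["the cat sat", "the dog sat", "the cat ran"]
unigrams, bigrams = train_bigram(corpus)
print(sequence_probability("the cat sat", unigrams, bigrams))
print(sequence_probability("the dog ran", unigrams, bigrams))
```

The second sentence gets probability zero because "dog ran" never appears in the training data. Practical models avoid this with smoothing or, in the case of neural language models, with learned representations.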
Greedy Algorithm
A greedy algorithm in AI solves a problem by choosing the best option available at each step. It does not check whether the option that looks best right now leads to an overall optimal outcome. A greedy algorithm never reverses an earlier decision, even when the choice was wrong, and it follows a top-down approach. Because it always takes the current best option, a greedy algorithm cannot produce the best result for many problems.
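Coin change is a standard way to see both sides of the greedy approach. The sketch below assumes that the coin set includes a coin of value 1, so that change can always be made. The function name `greedy_change` is chosen for this example.

```python
def greedy_change(amount, coins):
    """Make change by always taking the largest coin that still fits."""
    picked = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            picked.append(coin)  # commit to this coin, never revisit it
            amount -= coin
    return picked


print(greedy_change(30, [1, 5, 10, 25]))
print(greedy_change(6, [1, 3, 4]))
```

For the coins 1, 5, 10, and 25, the greedy choice is optimal: 30 becomes 25 + 5. For the coins 1, 3, and 4, it is not: greedy returns 4 + 1 + 1, three coins, while 3 + 3 needs only two.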
Neural Network
A neural network is an important advancement in AI, which teaches computers to process data in a way inspired by the human brain. You can also describe a neural network as a machine learning method, used in deep learning, that arranges interconnected neurons or nodes in layers. The neural network offers an adaptive system that helps computers learn from their mistakes and improve continuously. As a result, artificial neural networks are capable of addressing complicated problems, such as face recognition.
Masked Language Modeling
Language modeling refers to the task of fitting a model to domain-specific or general training data. Two versions of language modeling are used to train popular transformer models: causal language modeling and masked language modeling. Masked language modeling hides some of the tokens in a sequence by replacing them with a special mask token. The model is then trained to fill each mask with the correct token. As a result, the model can use the context to the left as well as to the right of each masked position.
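The masking step on its own can be sketched as below. This is a simplified illustration: the `[MASK]` string, the 15% default, and the function name `mask_tokens` are assumptions made for this example, and real training recipes add further refinements.

```python
import random


def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", seed=None):
    """Randomly hide tokens and return (masked_tokens, labels).

    labels holds the original token at masked positions and None
    elsewhere, so a model would be scored only on the masked positions.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for token in tokens:
        if rng.random() < mask_prob:
            masked.append(mask_token)
            labels.append(token)
        else:
            masked.append(token)
            labels.append(None)
    return masked, labels


tokens = "the quick brown fox jumps over the lazy dog".split()
masked, labels = mask_tokens(tokens, mask_prob=0.3, seed=0)
print(masked)
print(labels)
```

During training, the model sees `masked` as input and is asked to predict the entries of `labels` that are not `None`, using the words on both sides of each mask.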
Overfitting
Overfitting is an unfavorable behavior of a machine learning model in which it offers accurate predictions for the training data only. In machine learning use cases, data scientists train a model on a known data set so that it can make predictions. Based on what it learned from that training data, the model then predicts results for new data sets. An overfit model gives inaccurate predictions and performs poorly on new data.
Model Inference
Model inference refers to the process of running data points through a machine learning model to calculate an output, such as a single numerical score. The process can also be described as the operationalization of a machine learning model. In simple words, you can define model inference as the task of using a machine learning model in production. In the case of the BigQuery ML tool, machine learning inference covers tasks like prediction, generative AI, forecasting, and anomaly detection.
Sentiment Analysis
Sentiment analysis, also known as opinion mining, is an approach in Natural Language Processing or NLP that helps in identifying the emotional tone and context in a body of text. It is a popular method for determining and categorizing opinions about products, services, or new ideas. Sentiment analysis works by using a combination of data mining, computational linguistics, machine learning, and artificial intelligence. The combination of all these technologies helps in mining text to find sentiment and subjective information.
Self-Attention
The self-attention mechanism in the transformer architecture compares all the members of the input sequence with each other and modifies the corresponding output sequence positions accordingly. For every input, the self-attention layer performs a differentiable key-value search over the input sequence and adds the result to the output sequence. In this way, the self-attention mechanism helps inputs interact with each other by calculating the attention of all the inputs with respect to one specific input.
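The comparison of every input with every other input can be sketched with scaled dot-product attention. The version below is deliberately stripped down: it uses the inputs directly as queries, keys, and values, whereas a transformer layer first applies learned projection matrices. The three input vectors are made up for illustration.

```python
import math


def softmax(xs):
    """Turn raw scores into weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(inputs):
    """Scaled dot-product self-attention without learned projections."""
    dim = len(inputs[0])
    outputs, weights = [], []
    for query in inputs:
        # Compare this input with every input in the sequence.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in inputs
        ]
        attn = softmax(scores)
        weights.append(attn)
        # The output is the attention-weighted average of all inputs.
        outputs.append(
            [sum(w * value[d] for w, value in zip(attn, inputs)) for d in range(dim)]
        )
    return outputs, weights


# Hypothetical 2-dimensional inputs, for illustration only.
inputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(inputs)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` shows how strongly one input attends to every input in the sequence, including itself.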
Sequence-to-Sequence Models
Sequence-to-Sequence or seq2seq models are a special category of recurrent neural network architectures you can choose for machine translation tasks. They can also help in tasks such as image captioning and text summarization. A seq2seq model has two distinct components: the encoder and the decoder. Seq2seq models are trained on a dataset of input-output pairs, which improves the chances of producing the correct output sequence for a given input sentence.
Transfer Learning
Transfer learning is a technique in machine learning in which a model that has been trained for a specific task is repurposed for a second, similar task. It has also been defined as an optimization that supports faster progress and better performance when modeling the second task. Transfer learning is related to problems such as concept drift and multi-task learning. Most importantly, although it is popular in deep learning, it is not restricted to the domain of deep learning.
Word Embedding
Word embedding refers to an approach for representing words and documents. It can be described as a numeric vector input that allows different words with similar meanings to have similar representations. Word embedding condenses the meaning of a specific word and represents it in a lower-dimensional space. As a result, users can benefit from faster training in comparison to hand-built models that use graph embedding. A word embedding with 30 different values could represent 30 distinct features.
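Similarity between embeddings is commonly measured with cosine similarity. In the sketch below, the three-dimensional vectors in `EMBEDDINGS` are invented for illustration and are far smaller than real embeddings, which are learned from data.

```python
import math

# Hypothetical 3-dimensional embeddings, for illustration only.
EMBEDDINGS = {
    "king": [0.8, 0.6, 0.1],
    "queen": [0.7, 0.7, 0.1],
    "apple": [0.1, 0.0, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors, from -1 to 1."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


royal = cosine_similarity(EMBEDDINGS["king"], EMBEDDINGS["queen"])
fruit = cosine_similarity(EMBEDDINGS["king"], EMBEDDINGS["apple"])
print(round(royal, 3), round(fruit, 3))
```

Words with related meanings end up with vectors that point in similar directions, so "king" scores much closer to "queen" than to "apple" in this toy example.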
Zero-shot Learning
Zero-shot learning refers to a machine learning technique that helps a model classify objects from classes it has never seen before. The approach does not rely on any particular training for the unseen classes. Zero-shot learning is an ideal option for autonomous systems that must be able to identify and categorize new objects on their own. Zero-shot learning involves pre-training a model on a specific set of classes, followed by generalization to unseen classes.
Few-shot Learning
Few-shot learning is an important sub-domain of machine learning that focuses on classifying new data when only a few labeled training samples are available. The approach is comparatively new and needs more research and improvement over time. As of now, few-shot learning can serve as an ideal training approach for computer vision tasks. It lets computer vision models work with considerably fewer training samples, which is useful in sectors such as healthcare.
Named Entity Recognition
Named Entity Recognition, or NER, is a natural language processing task that helps in extracting desired information from a text. NER focuses on the detection and classification of important information in the text, referred to as named entities. The named entities could be key subjects in a specific text, such as names, locations, themes, topics, monetary values, and times. Named Entity Recognition aims to identify, categorize, and extract significant pieces of data from unstructured text without human intervention.
Cross-validation
Cross-validation is an important technique in machine learning for evaluating the performance of a model on unseen data. It divides the available data into subsets or folds and uses one of the folds as a validation set. It then trains the model on the remaining folds, and the process is repeated several times with a different fold as the validation set each time. In the final step, the average of the results from all the validation steps gives an estimate of the performance of the AI model.
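The fold rotation described above can be sketched without any machine learning library. In this minimal example, `evaluate` is a hypothetical callback that would train a model on the training indices and return its score on the validation indices. Shuffling and stratification are left out for brevity.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    indices = list(range(n_samples))
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


def cross_validate(n_samples, k, evaluate):
    """Average the score returned by evaluate(train, validation) over k folds."""
    scores = [evaluate(train, val) for train, val in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


for train, validation in k_fold_indices(10, 3):
    print("train:", train, "validation:", validation)
```

Every sample lands in the validation set exactly once, so the averaged score reflects the performance of the model across the whole dataset rather than on a single lucky or unlucky split.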
Feature Extraction
Feature extraction is an important process for dimensionality reduction in AI models, which reduces the initial set of raw data to smaller, more manageable groups for processing. A noticeable trait of large data sets is their large number of variables, which require more computing resources to process. Feature extraction also refers to the methods for selecting or combining variables into features. It reduces the amount of data to be processed while still delivering a complete and accurate description of the original dataset.