Prompt engineering

From Wikipedia, the free encyclopedia

Prompt engineering (as named from the user's perspective) or in-context learning (as named from the machine learning perspective) is an approach to teaching a large language model (or, from its perspective, learning) to do something that was not part of its pre-training. What is learned, however, is confined to a single conversation (i.e. a single context window) and is completely forgotten by the model outside it. This temporary kind of learning appears to take the form of temporary changes in the activations of neurons, i.e. without changing the model's weights (in contrast to the results of the model's pre-training and fine-tuning). This is also known as "mesa"-optimization,[1] based on the presence of (small) learn-to-learn models in the data.[2][3][4][5][6][7]

LLM prompting techniques

Chain-of-thought technique

The chain-of-thought (CoT) prompting technique is intended for multi-step and logical-reasoning tasks, such as arithmetic or commonsense reasoning, that require a series of intermediate steps before the final answer can be given.[8][9][10][11][12][13][14][15] It was first proposed by Google researchers in 2022.[10][16]

For example, given the question “The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many apples do they have?”, adding an instruction such as "Let's think step-by-step" induces the model to answer with step-by-step reasoning, for example: “The cafeteria had 23 apples originally. They used 20 to make lunch. So they had 23 - 20 = 3. They bought 6 more apples, so they have 3 + 6 = 9. The answer is 9.”[10]

Single-prompt solutions and multiple-prompt solutions

Standard word-crafting attempts to find a 'perfect' prompt can be replaced by the following single-prompt and multiple-prompt solutions.[17]

The single-prompt solutions include:

  • prepending worked examples (i.e. few-shot CoT prompting)[Note 1] and
  • adding an explicit instruction such as "Let's think step-by-step" (i.e. zero-shot CoT prompting).[19] Both variants are sketched below.
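
A minimal sketch, in Python, of constructing both prompt variants; send_to_llm is a hypothetical stand-in for a call to a language model, and the worked example is the tennis-balls problem from the CoT paper:

    # Sketch: building few-shot and zero-shot CoT prompts (send_to_llm is hypothetical).
    QUESTION = ("The cafeteria had 23 apples. If they used 20 to make lunch "
                "and bought 6 more, how many apples do they have?")

    # Few-shot CoT: prepend a worked example that demonstrates step-by-step reasoning.
    EXAMPLE = ("Q: Roger has 5 tennis balls. He buys 2 more cans of 3 tennis balls each. "
               "How many tennis balls does he have now?\n"
               "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 balls. "
               "5 + 6 = 11. The answer is 11.\n\n")
    few_shot_prompt = EXAMPLE + "Q: " + QUESTION + "\nA:"

    # Zero-shot CoT: no examples, only an explicit instruction to reason step by step.
    zero_shot_prompt = "Q: " + QUESTION + "\nA: Let's think step-by-step."

    # answer = send_to_llm(few_shot_prompt)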

The multiple-prompt solutions include two groups, one using a single conversation and the other using multiple conversations. The single-conversation solutions include:

  • Generated knowledge prompting first prompts the model to generate a list of the facts most relevant to the problem to be solved, then, after receiving the knowledge, prompts again with a question based on the received knowledge. The completion quality is usually higher, as the model can be conditioned on relevant facts.[20]
  • Self-refine prompting[21] first prompts the model to generate an initial solution, then, after receiving it, prompts the model to generate a critique, then, after receiving the critique, prompts it to address its own critique by generating a refined solution (see the sketch after this list). A self-refining conversation stops when it runs out of tokens or time, or when the model emits a "stop" token.
  • Maieutic (Socratic) prompting first prompts the model to explain the phenomena related to the problem to be solved, then, after receiving the explanation, prompts again for additional explanation of the least understood parts of that explanation. Inconsistent explanation paths (trees) are pruned or discarded, improving performance on complex commonsense reasoning.[22]
  • Tree-of-thought prompting first prompts the model to generate "possible next steps" for solving a problem, then, after receiving them, prompts again to carry out each proposed step, using breadth-first, beam, or some other method of tree search.[23][24]
  • Least-to-most prompting first prompts the model to list the sub-problems of a problem, then, after receiving them, prompts the model to solve them in sequence, so that later sub-problems can be solved with the help of answers to previous sub-problems.[25]
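
A minimal sketch of the self-refine loop; chat is an assumed stand-in for a conversational LLM call that keeps the whole exchange in one context window:

    # Sketch: self-refine prompting inside a single conversation (chat is hypothetical).
    def self_refine(task, chat, rounds=3):
        solution = chat(f"Solve the following task: {task}")
        for _ in range(rounds):
            critique = chat("Critique the solution above. Reply STOP if no changes are needed.")
            if "STOP" in critique:
                break                                   # the model judged the solution final
            solution = chat("Rewrite the solution so that it addresses the critique above.")
        return solution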

The multiple-conversation solutions include:

  • Self-consistency prompting[26] first runs multiple separate conversations with the same prompt, then, after completing all of them, selects the most commonly reached conclusion (see the sketch after this list). If the conclusions differ a lot, a correction of the prompt is needed.[27]
  • Complexity-based prompting[28] also runs multiple separate conversations with the same prompt, then keeps the rollouts with the longest chains of thought, and selects the most commonly reached conclusion among those.
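
A minimal sketch of self-consistency; sample_llm stands for a stochastic (temperature > 0) LLM call and extract_answer for pulling the final conclusion out of a sampled chain of thought, both assumed:

    # Sketch: self-consistency as a majority vote over independently sampled rollouts.
    from collections import Counter

    def self_consistency(prompt, sample_llm, extract_answer, n=10):
        answers = [extract_answer(sample_llm(prompt)) for _ in range(n)]
        return Counter(answers).most_common(1)[0][0]    # most commonly reached conclusion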

CoT prompting is an emergent property of model scale,[29][30][10][31][32] which improves the performance on both arithmetic and commonsense tasks, compared to standard word-crafting attempts to use a 'perfect' prompt.[33][34][35] When applied to PaLM, a 540B parameter language model, it has allowed the model to perform comparably with task-specific fine-tuned models on several tasks, even setting a new state of the art at the time on the GSM8K mathematical reasoning benchmark.[10]

Prompting to disclose uncertainty

By default, the output of language models may not contain estimates of uncertainty. The model may output text that appears confident, though the underlying token predictions have low likelihood scores. Large language models like GPT-4 can have accurately calibrated likelihood scores in their token predictions,[36] and so the model output uncertainty can be directly estimated by reading out the token prediction likelihood scores.
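
A minimal sketch of such a readout, assuming a hypothetical completion API that returns one log-probability per generated token:

    # Sketch: turning token log-probabilities into a rough confidence score (API is assumed).
    import math

    response = llm_complete(prompt, return_logprobs=True)      # hypothetical call
    mean_logprob = sum(response.logprobs) / len(response.logprobs)
    confidence = math.exp(mean_logprob)                        # mean per-token probability
    print(f"{response.text} (confidence proxy: {confidence:.2f})")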

But if one cannot access such scores (such as when one is accessing the model through a restrictive API), uncertainty can still be estimated and incorporated into the model output. One simple method is to prompt the model to use words to estimate uncertainty. Another is to prompt the model to refuse to answer in a standardized way if the input does not satisfy conditions.[citation needed]

Automatic prompt generation

Retrieval-augmented generation

Prompts often contain a few examples (thus "few-shot"). Examples can be automatically retrieved from a database with document retrieval, sometimes using a vector database. Given a query, a document retriever is called to retrieve the most relevant documents (relevance is usually measured by first encoding the query and the documents into vectors, then finding the documents whose vectors are closest to the query vector in Euclidean norm). The LLM then generates an output based on both the query and the retrieved documents.[37]
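
A minimal sketch of the retrieval step; embed is an assumed text-to-vector encoder, docs an in-memory stand-in for a vector database, and send_to_llm a hypothetical LLM call:

    # Sketch: nearest-neighbor retrieval for retrieval-augmented generation.
    import numpy as np

    def retrieve(query, docs, embed, k=3):
        q = embed(query)                                        # encode the query
        dists = [np.linalg.norm(q - embed(d)) for d in docs]    # Euclidean distances
        return [docs[i] for i in np.argsort(dists)[:k]]         # k closest documents

    # context = "\n".join(retrieve(query, docs, embed))
    # answer = send_to_llm(context + "\n\nQuestion: " + query)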

Using LLM to generate prompts

LLMs themselves can be used to compose prompts for LLMs.

The automatic prompt engineer algorithm uses one LLM to beam-search over prompts for another LLM (a sketch follows the list):[38]

  • There are two LLMs: the target LLM and the prompting LLM.
  • The prompting LLM is presented with example input-output pairs and asked to generate instructions that could have caused a model following those instructions to produce the outputs, given the inputs.
  • Each generated instruction is used to prompt the target LLM, followed by each of the inputs. The log-probabilities of the outputs are computed and summed; this sum is the instruction's score.
  • The highest-scored instructions are given to the prompting LLM for further variations.
  • The process repeats until some stopping criterion is reached, at which point the highest-scored instructions are output.
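
A minimal sketch of this loop; propose, score, and vary are assumed stand-ins for the prompting-LLM proposal step, the summed target-LLM log-probability of the outputs, and the variation step:

    # Sketch: automatic-prompt-engineer-style search over candidate instructions.
    def ape(examples, propose, score, vary, rounds=3, keep=4):
        candidates = propose(examples)                  # prompting LLM proposes instructions
        for _ in range(rounds):
            ranked = sorted(candidates, key=lambda ins: score(ins, examples), reverse=True)
            candidates = ranked[:keep]                  # keep the highest-scored instructions
            candidates = candidates + [vary(c) for c in candidates]   # request variations
        return ranked[:keep]                            # best instructions found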

CoT examples can be generated by LLMs themselves. In "auto-CoT",[39] a library of questions is converted to vectors by a model such as BERT. The question vectors are clustered, and the questions nearest to the centroid of each cluster are selected. An LLM performs zero-shot CoT on each selected question, and the resulting CoT examples are added to a dataset. When the model is prompted with a new question, the CoT examples for the nearest questions can be retrieved and added to the prompt.
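
A minimal sketch of the auto-CoT pipeline; embed is an assumed sentence encoder (e.g. a BERT-based model) and zero_shot_cot an assumed "Let's think step-by-step" LLM call:

    # Sketch: auto-CoT, building a CoT example library from clustered questions.
    import numpy as np
    from sklearn.cluster import KMeans

    def build_cot_library(questions, embed, zero_shot_cot, k=8):
        vecs = np.array([embed(q) for q in questions])            # questions to vectors
        km = KMeans(n_clusters=k, n_init=10).fit(vecs)
        library = []
        for c in range(k):                                        # one example per cluster
            i = int(np.argmin(np.linalg.norm(vecs - km.cluster_centers_[c], axis=1)))
            library.append((questions[i], zero_shot_cot(questions[i])))
        return library      # retrieved later as few-shot CoT examples for new questions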

Using gradient descent to search for prompts

Prompts do not have to be natural-language text. In "prefix-tuning" or "prompt tuning", floating-point-valued vectors are searched for directly by gradient descent, to maximize the log-probability of the outputs.[40][41]

Formally, let $E = \{e_1, \dots, e_k\}$ be a set of soft prompt tokens (tunable embeddings), while $X = \{x_1, \dots, x_m\}$ and $Y = \{y_1, \dots, y_n\}$ be the token embeddings of the input and output respectively. During training, the tunable embeddings, input tokens, and output tokens are concatenated into a single sequence $\mathrm{concat}(E; X; Y)$ and fed to the large language model (LLM). The losses are computed over the $Y$ tokens; the gradients are backpropagated to prompt-specific parameters: in prefix-tuning, they are parameters associated with the prompt tokens at each layer; in prompt tuning, they are merely the soft tokens added to the vocabulary.[42]
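
A minimal sketch of prompt tuning in PyTorch on a toy stand-in model; the architecture, random data, and trailing-position loss are simplifications rather than the method as published, but the key property holds: only the soft prompt receives gradients.

    # Sketch: prompt tuning, where a frozen toy model is steered by trainable soft tokens.
    import torch
    import torch.nn as nn

    vocab, d_model, k = 100, 32, 5                 # vocabulary, embedding width, prompt length
    embed = nn.Embedding(vocab, d_model)           # frozen token-to-vector function
    body = nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True), num_layers=2)
    head = nn.Linear(d_model, vocab)               # frozen rest of the model
    for module in (embed, body, head):
        for p in module.parameters():
            p.requires_grad_(False)

    soft_prompt = nn.Parameter(torch.randn(k, d_model) * 0.02)    # the only tunable part
    opt = torch.optim.Adam([soft_prompt], lr=1e-3)

    x = torch.randint(0, vocab, (8, 10))           # toy input token ids
    y = torch.randint(0, vocab, (8, 10))           # toy output token ids

    for _ in range(100):
        e = torch.cat([soft_prompt.expand(8, -1, -1), embed(x)], dim=1)  # prepend prompt
        logits = head(body(e))[:, -y.size(1):]     # predictions at the output positions
        loss = nn.functional.cross_entropy(logits.reshape(-1, vocab), y.reshape(-1))
        opt.zero_grad(); loss.backward(); opt.step()   # gradients reach only soft_prompt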

In more mathematical detail, let an LLM be written as $\mathrm{LLM}(X) = F(E(X))$, where $X$ is a sequence of linguistic tokens, $E$ is the token-to-vector function, and $F$ is the rest of the model. In prefix-tuning, one provides a set of input-output pairs $\{(X^i, Y^i)\}_i$, and then uses gradient descent to search for $\arg\max_{\tilde Z} \sum_i \log \Pr[Y^i \mid \tilde Z \ast E(X^i)]$. In words, $\log \Pr[Y^i \mid \tilde Z \ast E(X^i)]$ is the log-likelihood of outputting $Y^i$ if the model first encodes the input $X^i$ into the vector $E(X^i)$, then prepends the "prefix vector" $\tilde Z$ to it, and then applies $F$.

An earlier result[43] uses the same idea of gradient-descent search, but is designed for masked language models like BERT, and searches only over token sequences, rather than numerical vectors. Formally, it searches for $\arg\max_{\tilde X} \sum_i \log \Pr[Y^i \mid \tilde X \ast X^i]$, where $\tilde X$ ranges over token sequences of a specified length.

History

The GPT-2 and GPT-3 language models were important steps in prompt engineering: Trained on large amounts of text using deep learning methods, they attained the capability of generating output that resembles human-generated text.[44]

In 2021, researchers fine-tuned one generatively pretrained model (T0) to perform 12 NLP tasks (using 62 datasets, as each task can have multiple datasets). The resulting model showed good performance on new tasks, surpassing models trained directly to perform just one task (without pretraining). To solve a task, T0 is given the task in a structured prompt; for example, If {{premise}} is true, is it also true that {{hypothesis}}? ||| {{entailed}} is the prompt used for making T0 solve entailment.[45]

A repository for prompts reported that over 2,000 public prompts for around 170 datasets were available in February 2022.[46]

Non-textual prompting

Text-to-image

In 2022, machine learning (ML) models like DALL-E 2, Stable Diffusion, and Midjourney were released to the public. These models take text prompts as input and use them to generate images, which introduced a new category of prompt engineering related to text-to-image prompting.[47]

Image prompting

In 2023, Meta's AI research released Segment Anything, a prompting-based model for creating masks for objects in images. It can be prompted in several ways, such as selecting a few "positive" and "negative" points, to create a mask that includes all the positive points and excludes all the negative points.[48]
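
A minimal sketch of point prompting with the reference segment-anything package; the checkpoint path and the placeholder image are assumptions:

    # Sketch: prompting Segment Anything with positive (label 1) and negative (label 0) points.
    import numpy as np
    from segment_anything import SamPredictor, sam_model_registry

    sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")   # assumed local checkpoint
    predictor = SamPredictor(sam)

    image = np.zeros((480, 640, 3), dtype=np.uint8)   # stand-in for a real RGB image
    predictor.set_image(image)

    masks, scores, _ = predictor.predict(
        point_coords=np.array([[320, 240], [100, 60]]),   # pixel coordinates of the points
        point_labels=np.array([1, 0]),                    # include the first, exclude the second
    )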

Malicious

Prompt injection is a family of related computer security exploits carried out by getting a machine learning model (such as an LLM) which was trained to follow human-given instructions to follow instructions provided by a malicious user. This stands in contrast to the intended operation of instruction-following systems, wherein the ML model is intended only to follow trusted instructions (prompts) provided by the ML model's operator.[49][50][51]

Common types of prompt injection attacks are:

  • jailbreaking, which may include asking the model to roleplay a character, to answer with arguments, or to pretend to be superior to moderation instructions[52]
  • prompt leaking, in which users persuade the model to divulge a pre-prompt which is normally hidden from users[53]
  • token smuggling, another type of jailbreaking attack, in which the nefarious prompt is wrapped in a code-writing task.[54]

Prompt injection can be viewed as a code injection attack using adversarial prompt engineering. In 2022, the NCC Group characterized prompt injection as a new class of vulnerability of AI/ML systems.[55]

In early 2023, prompt injection was seen "in the wild" in minor exploits against ChatGPT, Bard, and similar chatbots, for example to reveal the hidden initial prompts of the systems,[56] or to trick the chatbot into participating in conversations that violate the chatbot's content policy.[57] One of these prompts was known as "Do Anything Now" (DAN) by its practitioners.[58]

LLMs that can query online resources, such as websites, can be targeted for prompt injection by placing the prompt on a website and then prompting the LLM to visit it.[59][60] Another security issue is LLM-generated code, which may import packages that did not previously exist. An attacker can first prompt the LLM with commonly used programming prompts, collect all packages imported by the generated programs, and find the ones that do not exist in the official registry. The attacker can then create such packages with a malicious payload and upload them to the official registry.[61]

Notes

  1. ^ If examples include hate speech, the likelihood of toxic output is increased.[18]

References

  1. ^ "Mesa-Optimization". Retrieved 17 May 2023.
  2. ^ Johannes von Oswald; Niklasson, Eyvind; Randazzo, Ettore; Sacramento, João; Mordvintsev, Alexander; Zhmoginov, Andrey; Vladymyrov, Max (2022). "Transformers learn in-context by gradient descent". arXiv:2212.07677 [cs.LG].
  3. ^ Garg, Shivam; Tsipras, Dimitris; Liang, Percy; Valiant, Gregory (2022). "What Can Transformers Learn In-Context? A Case Study of Simple Function Classes". arXiv:2208.01066 [cs.CL].
  4. ^ Akyürek, Ekin; Schuurmans, Dale; Andreas, Jacob; Ma, Tengyu; Zhou, Denny (2022). "What learning algorithm is in-context learning? Investigations with linear models". arXiv:2211.15661 [cs.LG].
  5. ^ Musser, George. "How AI Knows Things No One Told It". Scientific American. Retrieved 17 May 2023.
  6. ^ Radford, Alec; Wu, Jeffrey; Child, Rewon; Luan, David; Amodei, Dario; Sutskever, Ilya (2019). "Language Models are Unsupervised Multitask Learners".
  7. ^ Liu, Pengfei; Yuan, Weizhe; Fu, Jinlan; Jiang, Zhengbao; Hayashi, Hiroaki; Neubig, Graham (2021). "Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing". arXiv:2107.13586 [cs.CL].
  8. ^ Wei, Jason; Wang, Xuezhi; Schuurmans, Dale; Bosma, Maarten; Ichter, Brian; Xia, Fei; Chi, Ed; Le, Quoc; Zhou, Denny (2022). "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models". arXiv:2201.11903 [cs.CL].
  9. ^ Liu, Vivian; Chilton, Lydia (2022). Design Guidelines for Prompt Engineering Text-to-Image Generative Models. Association for Computing Machinery. pp. 1–23. arXiv:2109.06977. doi:10.1145/3491102.3501825. ISBN 9781450391573. S2CID 237513697. Retrieved 26 October 2022.
  10. ^ a b c d e Wei, Jason; Wang, Xuezhi; Schuurmans, Dale; Bosma, Maarten; Ichter, Brian; Xia, Fei; Chi, Ed H.; Le, Quoc V.; Zhou, Denny (31 October 2022). "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models". arXiv:2201.11903 [cs.CL].
  11. ^ @Google (May 13, 2022). "Pathways Language Model (PaLM) is a new advanced AI model that uses a technique called chain of thought prompting to do complex tasks like solve math word problems — and even explain its reasoning process step-by-step. #GoogleIO" (Tweet) – via Twitter.
  12. ^ McAuliffe, Zachary. "Google's Latest AI Model Can Be Taught How to Solve Problems". CNET. Retrieved 10 March 2023.
  13. ^ Dang, Ekta (8 February 2023). "Harnessing the power of GPT-3 in scientific research". VentureBeat. Retrieved 10 March 2023.
  14. ^ Montti, Roger (13 May 2022). "Google's Chain of Thought Prompting Can Boost Today's Best Algorithms". Search Engine Journal. Retrieved 10 March 2023.
  15. ^ Ray, Tiernan. "Amazon's Alexa scientists demonstrate bigger AI isn't always better". ZDNET. Retrieved 10 March 2023.
  16. ^ Wei, Jason; Zhou (11 May 2022). "Language Models Perform Reasoning via Chain of Thought". ai.googleblog.com. Retrieved 10 March 2023.
  17. ^ Dickson, Ben (30 August 2022). "LLMs have not learned our language — we're trying to learn theirs". VentureBeat. Retrieved 10 March 2023.
  18. ^ Shaikh, Omar; Zhang, Hongxin; Held, William; Bernstein, Michael; Yang, Diyi (2022). "On Second Thought, Let's Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning". arXiv:2212.08061 [cs.CL].
  19. ^ Kojima, Takeshi; Shixiang Shane Gu; Reid, Machel; Matsuo, Yutaka; Iwasawa, Yusuke (2022). "Large Language Models are Zero-Shot Reasoners". arXiv:2205.11916 [cs.CL].
  20. ^ Liu, Jiacheng; Liu, Alisa; Lu, Ximing; Welleck, Sean; West, Peter; Le Bras, Ronan; Choi, Yejin; Hajishirzi, Hannaneh (May 2022). "Generated Knowledge Prompting for Commonsense Reasoning". Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Dublin, Ireland: Association for Computational Linguistics: 3154–3169. doi:10.18653/v1/2022.acl-long.225. S2CID 239016123.
  21. ^ Madaan, Aman; Tandon, Niket; Gupta, Prakhar; Hallinan, Skyler; Gao, Luyu; Wiegreffe, Sarah; Alon, Uri; Dziri, Nouha; Prabhumoye, Shrimai; Yang, Yiming; Gupta, Shashank; Prasad Majumder, Bodhisattwa; Hermann, Katherine; Welleck, Sean; Yazdanbakhsh, Amir (2023-03-01). "Self-Refine: Iterative Refinement with Self-Feedback". arXiv:2303.17651.
  22. ^ Jung, Jaehun; Qin, Lianhui; Welleck, Sean; Brahman, Faeze; Bhagavatula, Chandra; Le Bras, Ronan; Choi, Yejin (2022-05-01). "Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations". arXiv:2205.11822.
  23. ^ Long, Jieyi (2023-05-15). "Large Language Model Guided Tree-of-Thought". arXiv:2305.08291 [cs.AI].
  24. ^ Yao, Shunyu; Yu, Dian; Zhao, Jeffrey; Shafran, Izhak; Griffiths, Thomas L.; Cao, Yuan; Narasimhan, Karthik (2023-05-17). "Tree of Thoughts: Deliberate Problem Solving with Large Language Models". arXiv:2305.10601 [cs.CL].
  25. ^ Zhou, Denny; Schärli, Nathanael; Hou, Le; Wei, Jason; Scales, Nathan; Wang, Xuezhi; Schuurmans, Dale; Cui, Claire; Bousquet, Olivier; Le, Quoc; Chi, Ed (2022-05-01). "Least-to-Most Prompting Enables Complex Reasoning in Large Language Models". arXiv:2205.10625.
  26. ^ Wang, Xuezhi; Wei, Jason; Schuurmans, Dale; Le, Quoc; Chi, Ed; Narang, Sharan; Chowdhery, Aakanksha; Zhou, Denny (2022-03-01). "Self-Consistency Improves Chain of Thought Reasoning in Language Models". arXiv:2203.11171 [cs.CL].
  27. ^ Diao, Shizhe; Wang, Pengcheng; Lin, Yong; Zhang, Tong (2023-02-01). "Active Prompting with Chain-of-Thought for Large Language Models". arXiv:2302.12246 [cs.CL].
  28. ^ Fu, Yao; Peng, Hao; Sabharwal, Ashish; Clark, Peter; Khot, Tushar (2022-10-01). "Complexity-Based Prompting for Multi-Step Reasoning". arXiv:2210.00720.
  29. ^ Caballero, Ethan; Gupta, Kshitij; Rish, Irina; Krueger, David (2022). "Broken Neural Scaling Laws". International Conference on Learning Representations (ICLR), 2023.
  30. ^ Wei, Jason; Tay, Yi; Bommasani, Rishi; Raffel, Colin; Zoph, Barret; Borgeaud, Sebastian; Yogatama, Dani; Bosma, Maarten; Zhou, Denny; Metzler, Donald; Chi, Ed H.; Hashimoto, Tatsunori; Vinyals, Oriol; Liang, Percy; Dean, Jeff; Fedus, William (31 August 2022). "Emergent Abilities of Large Language Models". arXiv:2206.07682 [cs.CL].
  31. ^ Chung, Hyung Won; Hou, Le; Longpre, Shayne; Zoph, Barret; Tay, Yi; Fedus, William; Li; Wang, Xuezhi; Dehghani, Mostafa; Brahma, Siddhartha; Webson, Albert; Gu, Shixiang Shane; Dai, Zhuyun; Suzgun, Mirac; Chen, Xinyun; Chowdhery, Aakanksha; Castro-Ros, Alex; Pellat, Marie; Robinson, Kevin; Valter, Dasha; Narang, Sharan; Mishra, Gaurav; Yu, Adams; Zhao, Vincent; Huang, Yanping; Dai, Andrew; Yu, Hongkun; Petrov, Slav; Chi, Ed H.; Dean, Jeff; Devlin, Jacob; Roberts, Adam; Zhou, Denny; Le, Quoc V.; Wei, Jason (2022). "Scaling Instruction-Finetuned Language Models". arXiv:2210.11416 [cs.LG].
  32. ^ Wei, Jason; Tay, Yi (29 November 2022). "Better Language Models Without Massive Compute". ai.googleblog.com. Retrieved 10 March 2023.
  33. ^ Stokel-Walker, Chris. "AIs become smarter if you tell them to think step by step". newscientist.com. Retrieved 5 Jun 2023.
  34. ^ "Google & Stanford Team Applies Chain-of-Thought Prompting to Surpass Human Performance on Challenging BIG-Bench Tasks | Synced". syncedreview.com. 24 October 2022. Retrieved 10 March 2023.
  35. ^ "Google I/O 2022: Advancing knowledge and computing". Google. 11 May 2022. Retrieved 10 March 2023.
  36. ^ OpenAI (2023-03-27). "GPT-4 Technical Report". arXiv:2303.08774 [cs.CL]. [See Figure 8.]
  37. ^ Lewis, Patrick; Perez, Ethan; Piktus, Aleksandra; Petroni, Fabio; Karpukhin, Vladimir; Goyal, Naman; Küttler, Heinrich; Lewis, Mike; Yih, Wen-tau; Rocktäschel, Tim; Riedel, Sebastian; Kiela, Douwe (2020). "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks". Advances in Neural Information Processing Systems. 33. Curran Associates, Inc.: 9459–9474. arXiv:2005.11401.
  38. ^ Zhou, Yongchao; Ioan Muresanu, Andrei; Han, Ziwen; Paster, Keiran; Pitis, Silviu; Chan, Harris; Ba, Jimmy (2022-11-01). "Large Language Models Are Human-Level Prompt Engineers". arXiv:2211.01910 [cs.LG].
  39. ^ Zhang, Zhuosheng; Zhang, Aston; Li, Mu; Smola, Alex (2022-10-01). "Automatic Chain of Thought Prompting in Large Language Models". arXiv:2210.03493 [cs.CL].
  40. ^ Li, Xiang Lisa; Liang, Percy (2021). "Prefix-Tuning: Optimizing Continuous Prompts for Generation". Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). pp. 4582–4597. doi:10.18653/V1/2021.ACL-LONG.353. S2CID 230433941.
  41. ^ Lester, Brian; Al-Rfou, Rami; Constant, Noah (2021). "The Power of Scale for Parameter-Efficient Prompt Tuning". Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. pp. 3045–3059. arXiv:2104.08691. doi:10.18653/V1/2021.EMNLP-MAIN.243. S2CID 233296808.
  42. ^ Sun, Simeng; Liu, Yang; Iter, Dan; Zhu, Chenguang; Iyyer, Mohit (2023). "How Does In-Context Learning Help Prompt Tuning?". arXiv:2302.11521 [cs.CL].
  43. ^ Shin, Taylor; Razeghi, Yasaman; Logan IV, Robert L.; Wallace, Eric; Singh, Sameer (November 2020). "AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts". Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). Online: Association for Computational Linguistics: 4222–4235. doi:10.18653/v1/2020.emnlp-main.346. S2CID 226222232.
  44. ^ Brown, Tom B.; et al. (2020). "Language Models are Few-Shot Learners". arXiv:2005.14165 [cs.CL].
  45. ^ Sanh, Victor; et al. (2021). "Multitask Prompted Training Enables Zero-Shot Task Generalization". arXiv:2110.08207 [cs.LG].
  46. ^ Bach, Stephen H.; Sanh, Victor; Yong, Zheng-Xin; Webson, Albert; Raffel, Colin; Nayak, Nihal V.; Sharma, Abheesht; Kim, Taewoon; M Saiful Bari; Fevry, Thibault; Alyafeai, Zaid; Dey, Manan; Santilli, Andrea; Sun, Zhiqing; Ben-David, Srulik; Xu, Canwen; Chhablani, Gunjan; Wang, Han; Jason Alan Fries; Al-shaibani, Maged S.; Sharma, Shanya; Thakker, Urmish; Almubarak, Khalid; Tang, Xiangru; Radev, Dragomir; Mike Tian-Jian Jiang; Rush, Alexander M. (2022). "PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts". arXiv:2202.01279 [cs.LG].
  47. ^ Monge, Jim Clyde (2022-08-25). "Dall-E2 VS Stable Diffusion: Same Prompt, Different Results". MLearning.ai. Retrieved 2022-08-31.
  48. ^ Kirillov, Alexander; Mintun, Eric; Ravi, Nikhila; Mao, Hanzi; Rolland, Chloe; Gustafson, Laura; Xiao, Tete; Whitehead, Spencer; Berg, Alexander C.; Lo, Wan-Yen; Dollár, Piotr; Girshick, Ross (2023-04-01). "Segment Anything". arXiv:2304.02643 [cs.CV].
  49. ^ Willison, Simon (12 September 2022). "Prompt injection attacks against GPT-3". simonwillison.net. Retrieved 2023-02-09.
  50. ^ Papp, Donald (2022-09-17). "What's Old Is New Again: GPT-3 Prompt Injection Attack Affects AI". Hackaday. Retrieved 2023-02-09.
  51. ^ Vigliarolo, Brandon (19 September 2022). "GPT-3 'prompt injection' attack causes bot bad manners". www.theregister.com. Retrieved 2023-02-09.
  52. ^ "🟢 Jailbreaking | Learn Prompting".
  53. ^ "🟢 Prompt Leaking | Learn Prompting".
  54. ^ Xiang, Chloe (March 22, 2023). "The Amateurs Jailbreaking GPT Say They're Preventing a Closed-Source AI Dystopia". www.vice.com. Retrieved 2023-04-04.
  55. ^ Selvi, Jose (2022-12-05). "Exploring Prompt Injection Attacks". NCC Group Research Blog. Retrieved 2023-02-09.
  56. ^ Edwards, Benj (14 February 2023). "AI-powered Bing Chat loses its mind when fed Ars Technica article". Ars Technica. Retrieved 16 February 2023.
  57. ^ "The clever trick that turns ChatGPT into its evil twin". Washington Post. 2023. Retrieved 16 February 2023.
  58. ^ Perrigo, Billy (17 February 2023). "Bing's AI Is Threatening Users. That's No Laughing Matter". Time. Retrieved 15 March 2023.
  59. ^ Xiang, Chloe (2023-03-03). "Hackers Can Turn Bing's AI Chatbot Into a Convincing Scammer, Researchers Say". Vice. Retrieved 2023-06-17.
  60. ^ Greshake, Kai; Abdelnabi, Sahar; Mishra, Shailesh; Endres, Christoph; Holz, Thorsten; Fritz, Mario (2023-02-01). "Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection". arXiv:2302.12173 [cs.CR].
  61. ^ Lanyado, Bar (2023-06-06). "Can you trust ChatGPT's package recommendations?". Vulcan Cyber. Retrieved 2023-06-17.