Commit 0d0929c

Merge pull request #817 from vlwk/master
Imperceptible Perturbations support for TextAttack
2 parents 5fbb076 + 872f3af commit 0d0929c

81 files changed, +4520 -159 lines changed

README.md

Lines changed: 47 additions & 52 deletions
Large diffs are not rendered by default.

docs/1start/attacks4Components.md

Lines changed: 19 additions & 23 deletions
@@ -1,20 +1,15 @@
-Four Components of TextAttack Attacks
-========================================
-
-To unify adversarial attack methods into one system, We formulate an attack as consisting of four components: a **goal function** which determines if the attack has succeeded, **constraints** defining which perturbations are valid, a **transformation** that generates potential modifications given an input, and a **search method** which traverses through the search space of possible perturbations. The attack attempts to perturb an input text such that the model output fulfills the goal function (i.e., indicating whether the attack is successful) and the perturbation adheres to the set of constraints (e.g., grammar constraint, semantic similarity constraint). A search method is used to find a sequence of transformations that produce a successful adversarial example.
-
+# Four Components of TextAttack Attacks
 
+To unify adversarial attack methods into one system, We formulate an attack as consisting of four components: a **goal function** which determines if the attack has succeeded, **constraints** defining which perturbations are valid, a **transformation** that generates potential modifications given an input, and a **search method** which traverses through the search space of possible perturbations. The attack attempts to perturb an input text such that the model output fulfills the goal function (i.e., indicating whether the attack is successful) and the perturbation adheres to the set of constraints (e.g., grammar constraint, semantic similarity constraint). A search method is used to find a sequence of transformations that produce a successful adversarial example.
 
 This modular design enables us to easily assemble attacks from the literature while re-using components that are shared across attacks. TextAttack provides clean, readable implementations of 16 adversarial attacks from the literature. For the first time, these attacks can be benchmarked, compared, and analyzed in a standardized setting.
 
-
 - Two examples showing four components of two SOTA attacks
-![two-categorized-attacks](/_static/imgs/intro/01-categorized-attacks.png)
+![two-categorized-attacks](/_static/imgs/intro/01-categorized-attacks.png)
 
+- You can create one new attack (in one line of code!!!) from composing members of four components we proposed, for instance:
 
-- You can create one new attack (in one line of code!!!) from composing members of four components we proposed, for instance:
-
-```bash
+```bash
 # Shows how to build an attack from components and use it on a pre-trained model on the Yelp dataset.
 textattack attack --attack-n --model bert-base-uncased-yelp --num-examples 8 \
 --goal-function untargeted-classification \
@@ -39,27 +34,20 @@ A `Transformation` takes as input an `AttackedText` and returns a list of possib
 
 A `SearchMethod` takes as input an initial `GoalFunctionResult` and returns a final `GoalFunctionResult` The search is given access to the `get_transformations` function, which takes as input an `AttackedText` object and outputs a list of possible transformations filtered by meeting all of the attack’s constraints. A search consists of successive calls to `get_transformations` until the search succeeds (determined using `get_goal_results`) or is exhausted.
 
-
-
 ### On Benchmarking Attack Recipes
 
-- Please read our analysis paper: Searching for a Search Method: Benchmarking Search Algorithms for Generating NLP Adversarial Examples at [EMNLP BlackBoxNLP](https://arxiv.org/abs/2009.06368).
-
-- As we emphasized in the above paper, we don't recommend to directly compare Attack Recipes out of the box.
-
-- This is due to that attack recipes in the recent literature used different ways or thresholds in setting up their constraints. Without the constraint space held constant, an increase in attack success rate could come from an improved search or a better transformation method or a less restrictive search space.
-
+- Please read our analysis paper: Searching for a Search Method: Benchmarking Search Algorithms for Generating NLP Adversarial Examples at [EMNLP BlackBoxNLP](https://arxiv.org/abs/2009.06368).
 
+- As we emphasized in the above paper, we don't recommend to directly compare Attack Recipes out of the box.
 
-### Four components in Attack Recipes we have implemented
+- This is due to that attack recipes in the recent literature used different ways or thresholds in setting up their constraints. Without the constraint space held constant, an increase in attack success rate could come from an improved search or a better transformation method or a less restrictive search space.
 
+### Four components in Attack Recipes we have implemented
 
 - TextAttack provides clean, readable implementations of 16 adversarial attacks from the literature.
 
 - To run an attack recipe: `textattack attack --recipe [recipe_name]`
 
-
-
 <table style="width:100%" border="1">
 <thead>
 <tr class="header">
@@ -224,13 +212,21 @@ A `SearchMethod` takes as input an initial `GoalFunctionResult` and returns a fi
 <td ><sub>Greedy attack with goal of changing every word in the output translation. Currently implemented as black-box with plans to change to white-box as done in paper (["Seq2Sick: Evaluating the Robustness of Sequence-to-Sequence Models with Adversarial Examples" (Cheng et al., 2018)](https://arxiv.org/abs/1803.01128)) </sub> </td>
 </tr>
 
+<tr><td style="text-align: center;" colspan="6"><strong><br>General: <br></strong></td></tr>
+
+<tr class="odd">
+<td style="text-align: left;"><code>bad-characters</code> <span class="citation" data-cites=""></span></td>
+<td style="text-align: left;"><sub>TargetedClassification, TargetedStrict, TargetedBonus, NamedEntityRecognition, LogitSum, MinimizeBleu, MaximizeLevenshtein</sub></td>
+<td style="text-align: left;"></td>
+<td style="text-align: left;"><sub>(Homoglyph, Invisible Characters, Reorderings, Deletions) Word Swap</sub></td>
+<td style="text-align: left;"><sub>DifferentialEvolution</sub></td>
+<td><sub>Uses imperceptible character-level perturbations including homoglyph substitutions, Unicode reordering, deletions, and invisibles. Based on (["Bad Characters: Imperceptible NLP Attacks" (Boucher et al., 2021)](https://arxiv.org/abs/2106.09898)).</sub></td>
+</tr>
 
 </tbody>
 </font>
 </table>
 
-
-
 - Citations
 
 ```

docs/3recipes/attack_recipes.rst

Lines changed: 11 additions & 0 deletions
@@ -160,5 +160,16 @@ Attacks on sequence-to-sequence models
 :noindex:
 
 
+General
+############################################
+
+19. BadCharacters (Bad Characters: Imperceptible NLP Attacks)
+
+
+.. automodule:: textattack.attack_recipes.bad_characters_2021
+:members:
+:noindex:
+
+
 
 

docs/3recipes/attack_recipes_cmd.md

Lines changed: 23 additions & 10 deletions
@@ -2,23 +2,24 @@
 
 We provide a number of pre-built attack recipes, which correspond to attacks from the literature.
 
-
 ## Help: `textattack --help`
 
 TextAttack's main features can all be accessed via the `textattack` command. Two very
 common commands are `textattack attack <args>`, and `textattack augment <args>`. You can see more
 information about all commands using
+
 ```bash
 textattack --help
 ```
+
 or a specific command using, for example,
+
 ```bash
 textattack attack --help
 ```
 
 The [`examples/`](https://github.com/QData/TextAttack/tree/master/examples) folder includes scripts showing common TextAttack usage for training models, running attacks, and augmenting a CSV file.
 
-
 The [documentation website](https://textattack.readthedocs.io/en/latest) contains walkthroughs explaining basic usage of TextAttack, including building a custom transformation and a custom constraint..
 
 ## Running Attacks: `textattack attack --help`
@@ -29,17 +30,20 @@ The easiest way to try out an attack is via the command-line interface, `textatt
 
 Here are some concrete examples:
 
-*TextFooler on BERT trained on the MR sentiment classification dataset*:
+_TextFooler on BERT trained on the MR sentiment classification dataset_:
+
 ```bash
 textattack attack --recipe textfooler --model bert-base-uncased-mr --num-examples 100
 ```
 
-*DeepWordBug on DistilBERT trained on the Quora Question Pairs paraphrase identification dataset*:
+_DeepWordBug on DistilBERT trained on the Quora Question Pairs paraphrase identification dataset_:
+
 ```bash
 textattack attack --model distilbert-base-uncased-cola --recipe deepwordbug --num-examples 100
 ```
 
-*Beam search with beam width 4 and word embedding transformation and untargeted goal function on an LSTM*:
+_Beam search with beam width 4 and word embedding transformation and untargeted goal function on an LSTM_:
+
 ```bash
 textattack attack --model lstm-mr --num-examples 20 \
 --search-method beam-search^beam_width=4 --transformation word-swap-embedding \
@@ -55,7 +59,6 @@ We include attack recipes which implement attacks from the literature. You can l
 
 To run an attack recipe: `textattack attack --recipe [recipe_name]`
 
-
 <table style="width:100%" border="1">
 <thead>
 <tr class="header">
@@ -220,23 +223,33 @@ To run an attack recipe: `textattack attack --recipe [recipe_name]`
 <td ><sub>Greedy attack with goal of changing every word in the output translation. Currently implemented as black-box with plans to change to white-box as done in paper, from <a href="https://arxiv.org/abs/1803.01128">"Seq2Sick: Evaluating the Robustness of Sequence-to-Sequence Models with Adversarial Examples" (Cheng et al., 2018)</a></sub> </td>
 </tr>
 
+<tr><td style="text-align: center;" colspan="6"><strong><br>General: <br></strong></td></tr>
+
+<tr>
+<td><code>bad-characters</code> <span class="citation" data-cites=""></span></td>
+<td><sub>TargetedClassification, TargetedStrict, TargetedBonus, NamedEntityRecognition, LogitSum, MinimizeBleu, MaximizeLevenshtein</sub> </td>
+<td></td>
+<td><sub>(Homoglyph, Invisible Characters, Reorderings, Deletions) Word Swap</sub> </td>
+<td><sub>DifferentialEvolution</sub></td>
+<td ><sub>Uses imperceptible character-level perturbations including homoglyph substitutions, Unicode reordering, deletions, and invisibles. Based on (["Bad Characters: Imperceptible NLP Attacks" (Boucher et al., 2021)](https://arxiv.org/abs/2106.09898)).</sub> </td>
+</tr>
 
 </tbody>
 </font>
 </table>
 
-
-
 ## Recipe Usage Examples
 
 Here are some examples of testing attacks from the literature from the command-line:
 
-*TextFooler against BERT fine-tuned on SST-2:*
+_TextFooler against BERT fine-tuned on SST-2:_
+
 ```bash
 textattack attack --model bert-base-uncased-sst2 --recipe textfooler --num-examples 10
 ```
 
-*seq2sick (black-box) against T5 fine-tuned for English-German translation:*
+_seq2sick (black-box) against T5 fine-tuned for English-German translation:_
+
 ```bash
 textattack attack --model t5-en-de --recipe seq2sick --num-examples 100
 ```

docs/api/goal_functions.rst

Lines changed: 25 additions & 0 deletions
@@ -9,6 +9,26 @@ GoalFunction
 .. autoclass:: textattack.goal_functions.GoalFunction
 :members:
 
+LogitSum
+--------------------------
+.. autoclass:: textattack.goal_functions.LogitSum
+:members:
+
+NamedEntityRecognition
+--------------------------
+.. autoclass:: textattack.goal_functions.NamedEntityRecognition
+:members:
+
+TargetedStrict
+--------------------------
+.. autoclass:: textattack.goal_functions.TargetedStrict
+:members:
+
+TargetedBonus
+--------------------------
+.. autoclass:: textattack.goal_functions.TargetedBonus
+:members:
+
 ClassificationGoalFunction
 --------------------------
 .. autoclass:: textattack.goal_functions.classification.ClassificationGoalFunction
@@ -44,3 +64,8 @@ NonOverlappingOutput
 .. autoclass:: textattack.goal_functions.text.NonOverlappingOutput
 :members:
 
+MaximizeLevenshtein
+--------------------------
+.. autoclass:: textattack.goal_functions.text.MaximizeLevenshtein
+:members:
+

docs/api/search_methods.rst

Lines changed: 5 additions & 0 deletions
@@ -44,3 +44,8 @@ ParticleSwarmOptimization
 .. autoclass:: textattack.search_methods.ParticleSwarmOptimization
 :members:
 
+DifferentialEvolution
+--------------------------
+.. autoclass:: textattack.search_methods.DifferentialEvolution
+:members:
+

docs/apidoc/textattack.attack_recipes.rst

Lines changed: 26 additions & 1 deletion
@@ -6,7 +6,8 @@ textattack.attack\_recipes package
 :undoc-members:
 :show-inheritance:
 
-
+Submodules
+----------
 
 
 .. automodule:: textattack.attack_recipes.a2t_yoo_2021
@@ -21,6 +22,12 @@ textattack.attack\_recipes package
 :show-inheritance:
 
 
+.. automodule:: textattack.attack_recipes.bad_characters_2021
+:members:
+:undoc-members:
+:show-inheritance:
+
+
 .. automodule:: textattack.attack_recipes.bae_garg_2019
 :members:
 :undoc-members:
@@ -39,6 +46,12 @@ textattack.attack\_recipes package
 :show-inheritance:
 
 
+.. automodule:: textattack.attack_recipes.chinese_recipe
+:members:
+:undoc-members:
+:show-inheritance:
+
+
 .. automodule:: textattack.attack_recipes.clare_li_2020
 :members:
 :undoc-members:
@@ -57,6 +70,12 @@ textattack.attack\_recipes package
 :show-inheritance:
 
 
+.. automodule:: textattack.attack_recipes.french_recipe
+:members:
+:undoc-members:
+:show-inheritance:
+
+
 .. automodule:: textattack.attack_recipes.genetic_algorithm_alzantot_2018
 :members:
 :undoc-members:
@@ -117,6 +136,12 @@ textattack.attack\_recipes package
 :show-inheritance:
 
 
+.. automodule:: textattack.attack_recipes.spanish_recipe
+:members:
+:undoc-members:
+:show-inheritance:
+
+
 .. automodule:: textattack.attack_recipes.textbugger_li_2018
 :members:
 :undoc-members:

docs/apidoc/textattack.attack_results.rst

Lines changed: 2 additions & 1 deletion
@@ -6,7 +6,8 @@ textattack.attack\_results package
 :undoc-members:
 :show-inheritance:
 
-
+Submodules
+----------
 
 
 .. automodule:: textattack.attack_results.attack_result

docs/apidoc/textattack.augmentation.rst

Lines changed: 2 additions & 1 deletion
@@ -6,7 +6,8 @@ textattack.augmentation package
 :undoc-members:
 :show-inheritance:
 
-
+Submodules
+----------
 
 
 .. automodule:: textattack.augmentation.augmenter

docs/apidoc/textattack.commands.rst

Lines changed: 2 additions & 1 deletion
@@ -6,7 +6,8 @@ textattack.commands package
 :undoc-members:
 :show-inheritance:
 
-
+Submodules
+----------
 
 
 .. automodule:: textattack.commands.attack_command
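
Reading aid for this commit: the four-component structure described in docs/1start/attacks4Components.md (goal function, constraints, transformation, search method) can be sketched with a toy homoglyph attack. This is a minimal illustration under stated assumptions, not TextAttack code. `toy_model`, `goal_function`, `constraint`, `transformation`, `search`, and the `HOMOGLYPHS` table are all hypothetical names, the classifier is a keyword check, and the search is a simple first-match scan rather than the DifferentialEvolution search method that the `bad-characters` recipe lists.

```python
# Toy sketch only: none of these names are TextAttack APIs.

# Hypothetical look-alike table: Latin letter -> visually similar Cyrillic letter.
HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u043e"}


def toy_model(text):
    """Stand-in classifier: 'positive' only if the ASCII word 'good' appears."""
    return "positive" if "good" in text else "negative"


def goal_function(original_label, candidate):
    """Untargeted goal: succeed when the predicted label changes."""
    return toy_model(candidate) != original_label


def constraint(original, candidate, max_changes=1):
    """Accept candidates of equal length that differ in at most max_changes characters."""
    if len(original) != len(candidate):
        return False
    changes = sum(1 for a, b in zip(original, candidate) if a != b)
    return changes <= max_changes


def transformation(text):
    """Return every single-character homoglyph substitution of text."""
    candidates = []
    for i, ch in enumerate(text):
        if ch in HOMOGLYPHS:
            candidates.append(text[:i] + HOMOGLYPHS[ch] + text[i + 1:])
    return candidates


def search(text):
    """Scan candidates in order; return the first valid one that meets the goal."""
    original_label = toy_model(text)
    for candidate in transformation(text):
        if constraint(text, candidate) and goal_function(original_label, candidate):
            return candidate
    return None  # search space exhausted without success


original = "a good movie"
adversarial = search(original)
print(toy_model(original), "->", toy_model(adversarial))
```

The perturbed string renders almost identically to the original, yet the keyword-based stand-in no longer matches it, which is the intuition behind imperceptible perturbations. Each function can be swapped out independently, which mirrors how the docs describe composing attacks from components.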
