Commit 07fd98e: Add AGENTS.md

AGENTS.md

For reference, you can look at a complete example PR adding the SmolLM3 LLM [here](https://github.com/elixir-nx/bumblebee/pull/422/files), and another one adding the Swin image classification model [here](https://github.com/elixir-nx/bumblebee/pull/394/files).

The main steps of adding a new model are the following:

1. Find the Python implementation and configuration files for the model in the `huggingface/transformers` project, for example [modeling_smollm3.py](https://github.com/huggingface/transformers/blob/v5.0.0rc1/src/transformers/models/smollm3/modeling_smollm3.py) and [configuration_smollm3.py](https://github.com/huggingface/transformers/blob/v5.0.0rc1/src/transformers/models/smollm3/configuration_smollm3.py).

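Instantiating the Python configuration class with no arguments is a quick way to see which options exist and what their defaults are, so the Bumblebee options can mirror them. A minimal sketch using `BertConfig` (any configuration class from `transformers` works the same way):

```python
from transformers import BertConfig

# Instantiating a configuration class with no arguments exposes the
# defaults that the corresponding Bumblebee model options should mirror.
config = BertConfig()

print(config.hidden_size)        #=> 768
print(config.num_hidden_layers)  #=> 12
```
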
2. Look at some existing model implementations in Bumblebee. In the case of LLMs, copying an existing LLM implementation is typically a good starting point.

3. Implement the model code.

   - Whenever possible, reuse existing primitives, most notably `Layers.Transformer.blocks/2`, which is shared across most LLM implementations. Sometimes models introduce novelties to the transformer design, in which case it may be necessary to add a new option to `Layers.Transformer.blocks/2`.

   - Include the relevant options from the Python model configuration as Bumblebee model options (with matching defaults).

   - Make sure `params_mapping/1` maps to the correct Python layer names. You can use `Bumblebee.load_model(..., log_params_diff: true)` to get all logs related to params loading.

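The Python layer names that `params_mapping/1` has to match are the keys of the Python model's state dict. A sketch of how to list them, building a tiny model locally (the `BertConfig`/`BertModel` classes and the small config values here are just placeholders, not from any real checkpoint):

```python
from transformers import BertConfig, BertModel

# Build a tiny model locally; its state dict keys are the Python-side
# layer names that params_mapping/1 has to map onto.
config = BertConfig(
    vocab_size=128,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=37,
)
model = BertModel(config)

for name in sorted(model.state_dict().keys()):
    print(name)
```
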
4. Add tests for each of the model architectures. Look at existing tests for reference. The tests should verify that a slice of the model output matches **reference values obtained from running the Python model**. The values can be obtained using a Python script like this:

```python
from transformers import BertModel
import torch

model = BertModel.from_pretrained("hf-internal-testing/tiny-random-BertModel")

inputs = {
    "input_ids": torch.tensor([[10, 20, 30, 40, 50, 60, 70, 80, 0, 0]]),
    "attention_mask": torch.tensor([[1, 1, 1, 1, 1, 1, 1, 1, 0, 0]])
}

outputs = model(**inputs)

print(outputs.last_hidden_state.shape)
print(outputs.last_hidden_state[:, 1:4, 1:4])

#=> torch.Size([1, 10, 32])
#=> tensor([[[-0.2331,  1.7817,  1.1736],
#=>          [-1.1001,  1.3922, -0.3391],
#=>          [ 0.0408,  0.8677, -0.0779]]], grad_fn=<SliceBackward0>)
```

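When porting reference values into a test, an exact equality check is too strict, since different frameworks accumulate floating-point error differently; a tolerance-based comparison is the usual approach. A small sketch (the `actual` tensor here is only a stand-in for the model output slice):

```python
import torch

# Reference slice printed by the Python model.
expected = torch.tensor([
    [-0.2331, 1.7817, 1.1736],
    [-1.1001, 1.3922, -0.3391],
    [0.0408, 0.8677, -0.0779],
])

# Stand-in for the corresponding output slice under test; small numeric
# drift between frameworks is expected and tolerated.
actual = expected + 1e-5

print(torch.allclose(actual, expected, atol=1e-4))  #=> True
```
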
For the tests, try finding model repositories in the [hf-internal-testing](https://huggingface.co/hf-internal-testing) organization. If there is no repository for the given model, you can use any other repository or a local checkpoint - once you open the PR, we will create a repository under [bumblebee-testing](https://huggingface.co/bumblebee-testing). To generate a checkpoint locally, you can use a Python script like this:

```python
from transformers import SmolLM3Config, SmolLM3Model, SmolLM3ForCausalLM, SmolLM3ForQuestionAnswering, SmolLM3ForSequenceClassification, SmolLM3ForTokenClassification

config = SmolLM3Config(
    vocab_size=1024,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=37,
    hidden_act="gelu",
    hidden_dropout_prob=0.1,
    attention_probs_dropout_prob=0.1,
    max_position_embeddings=512,
    type_vocab_size=16,
    is_decoder=False,
    initializer_range=0.02,
    pad_token_id=0,
    no_rope_layers=[0, 1]
)

for c in [SmolLM3Model, SmolLM3ForCausalLM, SmolLM3ForQuestionAnswering, SmolLM3ForSequenceClassification, SmolLM3ForTokenClassification]:
    name = c.__name__
    c(config).save_pretrained(f"bumblebee-testing/tiny-random-{name}", repo_id=f"bumblebee-testing/tiny-random-{name}")
```

You may need to adjust the configuration for the new model accordingly.

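It can also be worth sanity-checking that a locally generated checkpoint round-trips through `save_pretrained`/`from_pretrained` before opening the PR. A minimal sketch using `BertConfig`/`BertModel` with arbitrary tiny sizes (substitute the classes for the model being added):

```python
import tempfile

from transformers import BertConfig, BertModel

# Arbitrary tiny sizes, just enough to exercise save/load.
config = BertConfig(
    vocab_size=128,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=37,
)

with tempfile.TemporaryDirectory() as tmp:
    BertModel(config).save_pretrained(tmp)
    reloaded = BertModel.from_pretrained(tmp)

print(reloaded.config.hidden_size)  #=> 32
```
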
5. If the model uses a new type of tokenizer, you may need to add a new tokenizer mapping to `@tokenizer_types` in `lib/bumblebee/text/pre_trained_tokenizer.ex`, and a corresponding test in `test/bumblebee/text/pre_trained_tokenizer_test.exs`.

6. Finally, it is highly advisable to try the model end-to-end with a real-world checkpoint from the [HuggingFace Hub](https://huggingface.co/models) to make sure it produces the expected output. Given that models can have different configurations, it is possible to miss a relevant code path or option when testing solely against a tiny-random checkpoint.
