BigCode (@BigCodeProject)
SantaCoder is trained on Python, Java, and JavaScript, and it considerably outperforms larger multilingual models such as InCoder (6.7B) and CodeGen-multi (2.7B)! Many pieces from many collaborators came together to reach that result:
BigCode (@BigCodeProject)
SantaCoder is trained on The Stack (v1.1) dataset. Given the relatively small size of our model (1B parameters), we chose three popular programming languages: Python, Java, and JavaScript. You can check whether your code was used for training here: huggingface.co/spaces/bigcode…
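
For readers who want to look at the same three language subsets, here is a minimal sketch. It is not the project's training pipeline. It assumes the dataset is published on the Hugging Face Hub under the repo id `bigcode/the-stack` with one `data/<language>` subdirectory per language, that the `datasets` package is installed, and that you have accepted the dataset's access terms. It does not pin the v1.1 release mentioned above. The helper names are hypothetical.

```python
from typing import Iterator

# The three languages named in the thread.
LANGUAGES = ("python", "java", "javascript")


def language_data_dirs(languages: tuple = LANGUAGES) -> list:
    """Return per-language subdirectories, assuming a `data/<language>` layout."""
    return [f"data/{lang}" for lang in languages]


def stream_language(lang: str, repo_id: str = "bigcode/the-stack") -> Iterator[dict]:
    """Lazily stream one language subset (needs network and dataset access)."""
    # Imported here so the helper above works without the `datasets` package.
    from datasets import load_dataset

    # Assumption: the repo id and directory layout match the public dataset card.
    dataset = load_dataset(
        repo_id,
        data_dir=f"data/{lang}",
        split="train",
        streaming=True,  # avoid downloading the full subset up front
    )
    return iter(dataset)
```

Streaming keeps the example light: `next(stream_language("python"))` would fetch a single record rather than the whole subset.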