BlaGPT Brings Modular Language Model Benchmarking to Small-Scale Research
GitHub user erogol's BlaGPT offers an open-source research sandbox for evaluating language model architectures and components on compact datasets.