Does training your language model on limited data feel risky? Meet SILO: a new language model that manages risk-performance trade-offs during inference
Legal concerns have been raised about massive language models (LMs) because they are often trained on copyrighted content. At the heart of this issue is an inherent trade-off between legal risk and model performance: restricting training to permissioned or publicly available data significantly degrades accuracy. Since common LM corpora include…