Ethereum News Today: ETH Zurich and EPFL Launch Fully Open 8B and 70B Parameter LLM Trained on 15 Trillion Tokens

Generated by AI Agent Coin World
Tuesday, Aug 5, 2025 10:41 am ET
Aime Summary

- ETH Zurich and EPFL launch an open-source LLM in 8B and 70B parameter versions, trained on 15 trillion tokens across 1,500 languages using Switzerland’s carbon-neutral supercomputer.

- The model’s transparency, EU AI Act compliance, and multilingual support contrast with proprietary systems like GPT-4, enabling unrestricted auditing and deployment.

- It enables blockchain innovations like on-chain fraud detection and tokenized data markets, though open-source LLMs face performance gaps and legal risks compared to closed competitors.

- Despite challenges, open-source adoption grows as AI markets expand, with blockchain-AI segments projected to surge from $550M to $4.33B by 2034.

ETH Zurich and EPFL are set to release a groundbreaking open-source large language model (LLM) that promises to redefine AI research and development. Unlike most commercial models, which operate as "black-box" systems, this model will be fully open, offering public access to its parameters, training data, and code under the Apache 2.0 license. The model is being trained on Switzerland’s carbon-neutral “Alps” supercomputer, which uses 10,000 Grace-Hopper chips and is powered entirely by renewable energy [1].

The model will come in two configurations, 8 billion and 70 billion parameters, trained on 15 trillion tokens across 1,500 languages. This broad multilingual coverage contrasts sharply with the English-centric focus of many existing LLMs, offering a more globally inclusive foundation for research and application [1]. The composition of the training dataset is also disclosed: 60% is in English and 40% is in non-English languages, which is intended to ensure broad linguistic representation.
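
As a rough illustration of the language split described above, the short Python sketch below converts the quoted percentages into token counts. It assumes the 60/40 split is measured in tokens, which the article does not specify, and the variable names are purely illustrative.

```python
# Figures quoted in the article.
TOTAL_TOKENS = 15 * 10**12      # 15 trillion training tokens
ENGLISH_PERCENT = 60            # share of the dataset reported as English

# ASSUMPTION: the 60/40 split is measured in tokens (the article does not say).
english_tokens = TOTAL_TOKENS * ENGLISH_PERCENT // 100
non_english_tokens = TOTAL_TOKENS - english_tokens

print(f"English:     {english_tokens / 10**12:.0f} trillion tokens")
print(f"Non-English: {non_english_tokens / 10**12:.0f} trillion tokens")
```

Under that assumption, the dataset would contain roughly 9 trillion English tokens and 6 trillion non-English tokens.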

One of the most notable aspects of this project is its commitment to full transparency and compliance with data protection and copyright laws, including adherence to the EU AI Act. The model will enable users to audit, fine-tune, and deploy the LLM without restrictions, in stark contrast to proprietary systems like GPT-4, which is available only through API access and whose parameters and training data have not been publicly released [1].

The release of this open-source LLM is expected to drive innovation in areas such as blockchain integration and on-chain inference. Developers can utilize the model within rollup sequencers to enable real-time smart contract summarization and fraud detection. Additionally, the model’s open nature supports the creation of tokenized data marketplaces, where contributors can be fairly rewarded for their input, and DeFi systems can benefit from more deterministic and auditable outputs [1].

While the Swiss model is still under development and its full performance metrics have yet to be released, it already faces competition from other open-source LLMs such as Alibaba’s Qwen3. Qwen3 emphasizes model diversity and deployment efficiency, offering a Mixture-of-Experts (MoE) architecture with up to 235 billion parameters. However, the Swiss LLM focuses on full-stack transparency and multilingual support, with more detailed public access to data sources and training methodologies [1].

Despite these strengths, open-source LLMs face several challenges, including performance and scale gaps compared to proprietary models, implementation instability, high computational demands, and legal uncertainties. The complexity of deployment, combined with documentation deficiencies and the risk of hallucinations in fine-tuned models, can hinder widespread adoption. Additionally, the lack of rigorous governance in open ecosystems can introduce security risks, such as supply-chain vulnerabilities or data leakage [1].

The broader AI market is dominated by closed providers, with proprietary models controlling over 80% of the market. However, the growing interest in open-source alternatives, driven by transparency, flexibility, and compliance, suggests a shift in how AI is developed and deployed. With the AI market forecast to surpass $500 billion and the blockchain-AI segment projected to grow from $550 million in 2024 to $4.33 billion by 2034, open-source LLMs have significant potential to disrupt the industry [1].
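
To put the blockchain-AI projection in perspective, the growth from $550 million to $4.33 billion can be expressed as a compound annual growth rate (CAGR). The sketch below is a back-of-the-envelope calculation, not a figure from the source; the function name `cagr` is illustrative, and it assumes smooth compounding over the ten years from 2024 to 2034.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the article: $550 million (2024) to $4.33 billion (2034).
# ASSUMPTION: growth compounds evenly each year over the 10-year span.
blockchain_ai_cagr = cagr(550e6, 4.33e9, 2034 - 2024)
print(f"Implied CAGR: {blockchain_ai_cagr:.1%}")
```

Under these assumptions, the projection implies growth of roughly 23% per year.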

Source: [1] This open-source LLM could redefine AI research, and it’s 100% public (https://cointelegraph.com/explained/this-open-source-llm-could-redefine-ai-research-and-its-100-public)
