
Power of Chinese AI Models

AI/ML Models Technology Trends & Future Artificial Intelligence (AI) AI Models AI Industry Research Methods


Introduction
#

After the market turmoil triggered by DeepSeek R1, attention has shifted towards China. The West is now looking towards the East, and even those in the East are turning their gaze northward.

I have been tracking these models for some time, so I thought I would summarize them in one place for my readers.

Open source: 🚀

Partially or fully closed source: 🔒

List of Chinese Models
#

| Developer | Model Series | Models | Features of this Model |
| --- | --- | --- | --- |
| Tsinghua & Fudan University | OpenChineseGPT | OpenChineseGPT 🚀 | Dialogue, instruction-following |
| Tsinghua & Fudan University | OpenBuddy | OpenBuddy 🚀 | Dialogue, instruction-following |
| Tsinghua & Fudan University | OpenChineseLLaMA | OpenChineseLLaMA 🚀 | Dialogue, instruction-following |
| Shanghai AI Lab | Fengshenbang Series | Fengshenbang-13B 🚀, Fengshenbang-7B 🚀 | General-purpose, multilingual |
| IDEA Research | Ziya Series | Ziya-LLaMA 🚀, Ziya-13B 🚀 | Dialogue, instruction-following |
| Tsinghua University | CPM Series | CPM-1 🚀, CPM-2 🚀, CPM-3 🚀 | Early Chinese LLMs |
| Huawei | PanGu | PanGu 🔒 | Large-scale, multilingual |
| Tsinghua & Fudan University | Chinese LLaMA & Alpaca | Chinese LLaMA 🚀, Chinese Alpaca 🚀 | Dialogue, instruction-following |
| Fudan University | MOSS | MOSS 🚀 | Dialogue, general-purpose |
| Zhipu AI | ChatGLM Series | ChatGLM3 🚀, ChatGLM2 🚀, ChatGLM 🚀, GLM-4 🚀 | Chinese dialogue, multi-turn, long-context |
| Alibaba Cloud | Qwen Series | Qwen-1.8B 🚀, Qwen-7B 🚀, Qwen-14B 🚀, Qwen-72B 🚀, Qwen-2.5-1M 🚀 | Multimodal, multilingual, 32K tokens, strong performance on benchmarks |
| Baichuan Intelligent Tech | Baichuan Series | Baichuan-7B 🚀, Baichuan-13B 🚀, Baichuan2 🚀 | High performance, quantized versions |
| Shanghai AI Lab | InternLM Series | InternLM 🚀, InternLM-Chat 🚀 | General-purpose, long-context |
| 01.AI | Yi Series | Yi-1.0 🚀, Yi-6B 🚀, Yi-34B 🚀 | Multilingual, long-context |
| DeepSeek AI | DeepSeek Series | DeepSeek-V2 🚀, DeepSeek-LLM-67B 🚀, DeepSeek-R1 🚀 | High performance, Chinese & English, advanced reasoning for math and coding |
| Shenzhen Yuanxiang AI | XVERSE Series | XVERSE-7B 🚀, XVERSE-13B 🚀, XVERSE-65B 🚀 | Multilingual, 256K tokens |
| Peking University | YuLan Series | YuLan-Base-126B 🚀, YuLan-Chat-3-126B 🚀 | Multilingual, large-pretraining |
| Sichuan AI University | gLAW | LAW 🚀, LAWMiner 🚀, LLAMA 🚀, Fuzz 🚀, Mingcha 🚀 | Specialized for legal tasks |
| Baidu | ERNIE | ERNIE 3.0 Titan 🔒 | Knowledge enhanced with 260 billion parameters, supports multiple industries |
| ByteDance | Doubao | Doubao 1.5 Pro 🔒 | Better than ChatGPT-4o in knowledge retention, coding, reasoning, optimized for lower hardware costs |
| Tencent | Hunyuan | Hunyuan 🔒 | Supports image and text generation, logical reasoning, aimed at enterprise use |
| Moonshot AI | Kimi | Kimi k1.5 🔒 | Matches or outperforms OpenAI o1, focused on solving complex problems |
| SenseTime | SenseNova | SenseNova 🔒 | Includes models for natural language processing, content generation, data annotation |
| MiniMax | MiniMax-Text | MiniMax-Text-01 🔒 | Large parameter size (456 billion), outperforms on some benchmarks, large context window |
| Kuaishou | Kling | Kling 🔒 | Text-to-video model, free to public, simulates real-world motion and physics |
| iFlytek | iFlytek Spark | iFlytek Spark V4.0 🔒 | Improved core capabilities, ranks high in international tests compared to GPT-4 Turbo |
Dr. Hari Thapliyaal

Dr. Hari Thapliyaal is a seasoned professional and prolific blogger with a multifaceted background that spans the realms of Data Science, Project Management, and Advait-Vedanta Philosophy. Holding a Doctorate in AI/NLP from SSBM (Geneva, Switzerland), Hari has earned Master's degrees in Computers, Business Management, Data Science, and Economics, reflecting his dedication to continuous learning and a diverse skill set. With over three decades of experience in management and leadership, Hari has proven expertise in training, consulting, and coaching within the technology sector. His extensive 16+ years in all phases of software product development are complemented by a decade-long focus on course design, training, coaching, and consulting in Project Management. In the dynamic field of Data Science, Hari stands out with more than three years of hands-on experience in software development, training course development, training, and mentoring professionals. His areas of specialization include Data Science, AI, Computer Vision, NLP, complex machine learning algorithms, statistical modeling, pattern identification, and extraction of valuable insights. Hari's professional journey showcases his diverse experience in planning and executing multiple types of projects. He excels in driving stakeholders to identify and resolve business problems, consistently delivering excellent results. Beyond the professional sphere, Hari finds solace in long meditation, often seeking secluded places or immersing himself in the embrace of nature.
