While many leading AI scientists expected brain-scale artificial intelligence models to arrive only by the end of the decade, a relatively new company in the field has already designed and released a world-record chip meant to train AI models with more than 120 trillion parameters, exceeding the synapse count of the human brain by roughly 20 percent. And in addition to being able to train the largest models ever made, it does so while requiring only a fraction of the power of comparable systems. Welcome to today's episode of AI News.
In this episode, I will show you what makes the new startup Cerebras Systems' world-record AI acceleration chip special compared with the AI chips made by other computing giants, what this means for the field of artificial intelligence, and finally, in which applications and how soon you will be able to make use of it.

Cerebras Systems, a Silicon Valley firm that makes the world's biggest computer chip, announced on Tuesday that it can now weave together almost 200 of its newest and most powerful chips to dramatically lower the amount of power used by artificial intelligence. Cerebras is one of a slew of companies developing AI-focused chips in an attempt to take on industry heavyweights Nvidia Corporation and Alphabet Inc.'s Google. The business has received 475 million dollars in venture financing and has signed agreements with GlaxoSmithKline and AstraZeneca to use its chips to speed up drug research. Traditionally, hundreds or even thousands of computer chips are produced on a wafer, a 30-centimeter silicon disc that is then cut up into individual chips. Cerebras, on the other hand, uses the full wafer, so its chip is larger and can handle more data at once. However, artificial intelligence researchers now build AI models, called neural networks, that are too large to fit on a single chip, so they must distribute them across many processors. Even though the most advanced contemporary neural networks are only a fraction of the complexity of a human brain, they consume far more energy than human brains, because the systems that run them grow less power-efficient as more chips are added. Cerebras said on Wednesday that it can combine 192 of its chips to train massive neural networks while power efficiency remains constant as the number of processors increases. In other words, Cerebras processors can accomplish twice as much processing for half the power compared to existing systems, which require more than twice as much power to double their computing capability.
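To make that claim concrete, here is a minimal sketch of the arithmetic. The per-chip wattage and the overhead exponent below are my own invented illustration figures; the announcement only claims the shape of the comparison, namely that performance per watt stays roughly flat as chips are added, instead of degrading:

```python
# Illustrative comparison of cluster power scaling. All numbers are
# hypothetical; only the qualitative contrast (constant vs. degrading
# performance-per-watt) reflects the claim in the announcement.

def cluster_power(n_chips, watts_per_chip, overhead_exponent):
    """Total power for n chips. overhead_exponent > 1 models the
    interconnect/synchronization overhead that grows with cluster size."""
    return watts_per_chip * n_chips ** overhead_exponent

for n in (1, 2, 48, 192):
    linear = cluster_power(n, watts_per_chip=20_000, overhead_exponent=1.0)
    superlinear = cluster_power(n, watts_per_chip=20_000, overhead_exponent=1.3)
    print(f"{n:4d} chips | constant perf/W: {linear/1e6:7.2f} MW | "
          f"degrading perf/W: {superlinear/1e6:7.2f} MW")
```

Under the degrading curve, doubling the chip count costs more than double the power, which is exactly the trap the announcement says the new cluster design avoids.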
Cerebras also enables innovative methods to minimize the amount of computing work required to reach a solution, resulting in a faster time to answer. One of the most powerful levers for improving computing efficiency is sparsity. In the human brain, evolution favored sparsity: neurons exhibit activation sparsity, meaning not all neurons fire at the same time, and because not all synapses are fully connected, they exhibit weight sparsity as well. Human-built neural networks contain activation sparsity that prevents all neurons from activating at the same time, but they are also specified in a very organized, dense manner, making them over-parameterized. The principle behind sparsity is simple: multiplication by zero is a terrible idea, especially when time and power are involved. Despite this, graphics processing units frequently multiply by zero. There are many different forms of sparsity in neural networks: it may be found in both the activations and the parameters, and it can be either structured or unstructured.
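To see why multiplying by zero wastes work, here is a toy sketch that counts how many multiplies a sparsity-harvesting design could skip in a single layer. The matrix size and sparsity levels are my own assumptions, not Cerebras figures:

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy weight matrix with ~90% of entries pruned to zero (weight sparsity)
# and an activation vector where most units did not fire (activation sparsity).
weights = rng.standard_normal((256, 256)) * (rng.random((256, 256)) > 0.9)
activations = rng.standard_normal(256) * (rng.random(256) > 0.7)

# Dense hardware multiplies every entry, zeros included.
dense_multiplies = weights.size

# Hardware that harvests sparsity only multiplies where BOTH operands are
# nonzero, skipping the useless multiply-by-zero work entirely.
useful = (weights != 0) & (activations != 0)[np.newaxis, :]
sparse_multiplies = int(useful.sum())

print(f"dense: {dense_multiplies} multiplies, "
      f"sparsity-aware: {sparse_multiplies} "
      f"({sparse_multiplies / dense_multiplies:.1%} of the work)")
```

At these toy levels, only a few percent of the dense multiplies carry any information; the rest are exactly the multiply-by-zero work described above.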
The use of sparsity and other algorithmic methods to decrease the floating-point operations (FLOPs) necessary to train a model to state-of-the-art accuracy is becoming increasingly essential as the AI community grapples with the exponentially growing expense of training big models. Cerebras' technology allows a single CS-2 accelerator, which is about the size of a dorm-room refrigerator, to handle models with over 120 trillion parameters. Cerebras Weight Streaming, a novel software execution architecture; Cerebras MemoryX, a memory-extension technology; Cerebras SwarmX, a high-performance interconnect fabric technology; and Selectable Sparsity, a dynamic sparsity-harvesting technique, are all part of Cerebras' new technology portfolio. For the first time, Cerebras' weight-streaming technology allows model parameters to be stored off-chip while still offering the same training and inference performance as if they were on-chip. This novel execution paradigm separates computation from parameter storage, allowing researchers to scale model size and cluster performance independently while also removing the latency and memory-bandwidth problems that plague huge clusters of small processors. It greatly simplifies the workload-allocation approach and allows customers to grow from 1 to 192 CS-2s without making any software modifications. MemoryX is a memory-extension technology developed by Cerebras: it will supply up to 2.4 petabytes of high-performance memory to the second-generation Cerebras Wafer Scale Engine, all of which will behave as if it were on-chip, and it is what lets the CS-2 support models with up to 120 trillion parameters. Cerebras SwarmX is an off-chip extension of Cerebras Swarm, a high-performance, AI-optimized communication fabric; thanks to SwarmX, Cerebras will be able to connect up to 163 million AI-optimized cores across up to 192 CS-2s, all working together to train a single neural network.
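Cerebras has not published its software interface in this announcement, so the following is only a conceptual sketch of the weight-streaming idea: parameters live in an external store (standing in for MemoryX) and flow through the accelerator one layer at a time, so on-chip memory only ever holds one layer's weights plus the activations. All names and sizes here are illustrative:

```python
# A minimal sketch of weight streaming, not Cerebras' actual API.
import numpy as np

rng = np.random.default_rng(0)

# "MemoryX": an off-chip parameter store holding one weight matrix per layer.
layer_dims = [(512, 512)] * 4
memory_x = [rng.standard_normal(dims) * 0.02 for dims in layer_dims]

def forward(x):
    """Stream each layer's weights onto the 'chip', apply it, discard it."""
    for w in memory_x:               # weights arrive layer by layer
        x = np.maximum(x @ w, 0.0)   # on-chip compute (a toy ReLU MLP here)
    return x                         # only activations stay resident

out = forward(rng.standard_normal((8, 512)))
print(out.shape)  # (8, 512)
```

The point of the pattern is that the accelerator's memory footprint is set by the largest single layer, not by the whole model, which is why parameter capacity can grow independently of the chip.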
Selectable Sparsity lets users choose the weight-sparsity level in their model, resulting in a direct reduction in floating-point operations and, with it, time to solution. Weight sparsity is an intriguing area of machine-learning research that has been difficult to investigate because it runs inefficiently on graphics processing units. Selectable Sparsity allows the CS-2 to speed up work by employing any form of sparsity available, including unstructured and dynamic weight sparsity, to deliver faster results.
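From the user's side, "choose a sparsity level" might look something like magnitude pruning, sketched below. The function name and interface are hypothetical; only the generic pruning math and the FLOP count are standard:

```python
# Hypothetical user-side view of selectable weight sparsity.
import numpy as np

def apply_weight_sparsity(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with smallest magnitude."""
    threshold = np.quantile(np.abs(weights), sparsity)
    return np.where(np.abs(weights) < threshold, 0.0, weights)

rng = np.random.default_rng(0)
w = rng.standard_normal((1024, 1024))
for level in (0.0, 0.5, 0.9):
    pruned = apply_weight_sparsity(w, level)
    flops_fraction = np.count_nonzero(pruned) / pruned.size
    print(f"sparsity {level:.0%}: {flops_fraction:.1%} of dense FLOPs remain")
```

On hardware that skips zeros, the remaining FLOP fraction translates more or less directly into compute time, which is what makes the sparsity level a meaningful dial rather than a bookkeeping detail.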
With this mix of technologies, users will be able to unlock brain-scale neural networks and spread work over massive clusters of AI-optimized processors with ease. Cerebras has set a new standard for model size, compute-cluster horsepower, and ease of programming at scale. Cerebras already shook up the business by doubling the size of the biggest networks conceivable, and larger networks such as GPT-3 have already changed the natural-language-processing landscape, enabling previously unthinkable possibilities. The industry has progressed beyond one-trillion-parameter models, and Cerebras is now pushing two orders of magnitude farther, enabling brain-scale neural networks with 120 trillion parameters. In the last several years, it has been demonstrated that insight in NLP models scales directly with parameters: the more parameters, the better the results. Cerebras' innovations, which will give a hundredfold increase in parameter capacity, might revolutionize the sector.
Scientists will be able to investigate brain-sized models for the first time, opening up huge new opportunities for research and insight. The complexity and time needed to set up, configure, and then optimize a massive cluster for a given neural network is one of the most significant obstacles to using clusters on AI problems. The weight-streaming execution paradigm is elegant in its simplicity, and it allows for a fundamentally simpler allocation of work across the tremendous computational capacity of CS-2 clusters. Cerebras is eliminating all of the complexity that today surrounds building and efficiently using massive clusters, propelling the industry ahead on what I believe will be a transformative path. By combining the concepts of Weight Streaming, MemoryX, and SwarmX, Cerebras makes creating a massive cluster as simple as pressing a button. Cerebras doesn't try to disguise distribution complexity with software; instead, it has created a completely new design that eliminates scaling complexity entirely. Because of the WSE-2's size, there is no need to split a neural network's layers across many CS-2s: even today's biggest network layers fit on a single CS-2. Unlike GPU clusters, where each graphics processor holds a distinct part of the neural network, each CS-2 in a Cerebras cluster has the same software configuration. Adding another CS-2 has practically no effect on how the work executes.
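That identical-configuration property is essentially what pure data parallelism looks like. The sketch below shows the generic pattern, none of it Cerebras code: every "box" runs the same gradient code on its own slice of the batch, and the only cross-box step is the reduction, which is the role SwarmX plays in hardware:

```python
# Generic data parallelism: same program on every box, different data shards.
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 1))                        # one shared model
x, y = rng.standard_normal((256, 64)), rng.standard_normal((256, 1))

def local_gradient(w, x_shard, y_shard):
    """Plain least-squares gradient computed on one machine's shard."""
    err = x_shard @ w - y_shard
    return 2.0 * x_shard.T @ err / len(x_shard)

for n_boxes in (1, 4, 16):
    shards = zip(np.array_split(x, n_boxes), np.array_split(y, n_boxes))
    # Identical code on every box; averaging the shard gradients recovers
    # the single-machine gradient regardless of cluster size.
    grad = sum(local_gradient(w, xs, ys) for xs, ys in shards) / n_boxes
    print(n_boxes, "boxes -> gradient norm", float(np.linalg.norm(grad)))
```

Because the result is the same whether the batch is split across 1 box or 16, adding machines changes throughput without changing the program.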
Running a neural network on hundreds of CS-2s will therefore appear to a researcher the same as running it on a single machine: setting up a cluster will be as simple as compiling a workload for a single computer and applying the same mapping to every machine in the chosen cluster size. Cerebras' weight-streaming technology enables users to run neural-network applications on massive clusters of CS-2 systems with the programming ease of a single graphics processing unit.

"I mean, the things I think are really hard about having a useful humanoid robot are that it cannot navigate through the world without being explicitly trained, I mean, without explicit, line-by-line instructions. Can you talk to it and say, 'Please pick up that bolt and attach it to the car with that wrench,' and it should be able to do that? It should be able to handle, 'Please go to the store and get me the following groceries.'"
So, what is your opinion on this new kind of AI acceleration chip, whose makers claim it will enable models that surpass today's cutting edge of artificial intelligence a hundredfold? Do you believe their claims, and what applications do you expect to arrive from this new technology? Please tell us your opinion in the comment section below; I would love to hear what you have to say. Thank you for watching AI News, where we consistently report on the newest technologies shaping the future of our world. We'd appreciate you subscribing and watching our other videos. See you around, and take care.
Cerebras Systems has released the most powerful artificial intelligence accelerator chip in history. With this new chip, you'll be able to build an exaflop-class supercomputer aimed at outperforming the human brain. It makes it possible for a model to support more than 100 trillion parameters, roughly 100 times the capacity of today's largest AI models.
What does this mean? Simply put, in the very near future you will be able to buy access to an AI model with more raw parameter capacity than your own brain has synapses, able to do things today's models can only approximate. Just think about how much value this will create! This technology will spawn an entirely new industry of supercomputers dedicated to training AI models. Think of all the jobs this will create, and of how many businesses and entrepreneurs it will empower!

This is going to happen; it's just a matter of time. Now is the time to educate yourself. If you become aware and start using these technologies, and if you empower other people with your knowledge and connect them to each other, then the future will be something to celebrate, not fear. The future of computing is here… right now.
Cerebras Systems is a semiconductor company based in the U.S., with offices in Silicon Valley, San Diego, Toronto, and Tokyo. It designs, builds, and markets high-performance AI computing systems for the research, medical, automotive, military, oil-and-gas, internet, and other markets.