Big Model Daily, November 15


[Big Model Daily, November 15] Research: the 50 most-visited AI tools accumulated more than 24 billion visits over the past 10 months, with ChatGPT alone accounting for 60% of the traffic; OpenAI suspended new ChatGPT Plus registrations due to excessive usage; Notion launched a Q&A feature that uses AI to query and retrieve note content

Research: the 50 most-visited AI tools accumulated more than 24 billion visits in the past 10 months, with ChatGPT alone accounting for 60% of the traffic

Writerbuddy used SEMrush to tally website traffic data from September 2022 to August 2023. The 50 most-visited artificial intelligence tools attracted more than 24 billion visits, with ChatGPT leading at 14 billion visits, more than 60% of the analyzed traffic.

Due to excessive usage, OpenAI suspends new user registration for ChatGPT Plus

OpenAI’s paid service ChatGPT Plus has suspended new registrations due to excessive usage. On November 15, OpenAI CEO Sam Altman explained that the pause is meant to “make sure everyone has a great experience,” adding that users can still sign up within the app to be notified when subscriptions reopen.

Real-time image generation sped up 5-10x: Tsinghua’s LCM/LCM-LoRA goes viral with over one million views and more than 200,000 downloads

Text-to-image and image-to-image generation are nothing new, but these tools often run slowly, forcing users to wait a while for results. Recently, a model called LCM has changed that: it can even generate images continuously in real time.

LCM stands for Latent Consistency Model, built by researchers from the Institute for Interdisciplinary Information Sciences at Tsinghua University. Before its release, latent diffusion models (LDMs) such as Stable Diffusion were slow to generate because of the computational cost of their iterative sampling process. Through several innovations, LCM can produce high-resolution images in only a few inference steps. According to the team’s statistics, LCM makes mainstream text-to-image models 5-10 times more efficient, which is what enables the real-time effect.

Building on this, the research team further proposed LCM-LoRA, which transfers LCM’s fast-sampling ability to other LoRA models without any additional training, providing a straightforward and effective acceleration path for the many style-specific models that already exist in the open-source community.
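The “no additional training” transfer works because LoRA adapters are additive low-rank weight deltas, so an acceleration adapter and a style adapter can simply be summed onto the same frozen base weights. A toy numpy sketch of this idea (all shapes, scales, and names are illustrative, not LCM-LoRA’s real configuration):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 64, 64, 4  # toy dimensions; real model layers are far larger

# Frozen base weight of one layer.
W = rng.normal(size=(d_out, d_in))

# LCM-LoRA-style "acceleration" adapter: a low-rank delta B @ A.
B_acc, A_acc = rng.normal(size=(d_out, r)), rng.normal(size=(r, d_in))
# A separately trained style adapter for the same layer.
B_sty, A_sty = rng.normal(size=(d_out, r)), rng.normal(size=(r, d_in))

# Because both adapters are additive low-rank deltas, they can be
# combined with the base weights without any retraining:
scale_acc, scale_sty = 1.0, 0.8
W_combined = W + scale_acc * (B_acc @ A_acc) + scale_sty * (B_sty @ A_sty)

# Each adapter stores far fewer parameters than the base weight it modifies.
adapter_params = B_acc.size + A_acc.size
print(adapter_params, W.size)  # 512 4096
```

This additivity is why a single acceleration adapter can be distributed once and applied on top of the community’s existing style LoRAs.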

Fast and accurate! DeepMind releases GraphCast, an AI weather forecast model that beats the world’s most advanced system on 90% of metrics

GraphCast, the AI weather forecast model released by Google DeepMind, is fast and accurate, surpassing traditional forecasting methods for the first time: it outperforms ECMWF’s system on 90% of 1,380 evaluation metrics. GraphCast runs on Google’s TPU v4 cloud machines and can generate a 10-day weather forecast in one minute.
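GraphCast produces a 10-day forecast autoregressively: it predicts the atmospheric state 6 hours ahead, then feeds that prediction back in as input, 40 times in a row. A minimal rollout sketch (`toy_step` is a stand-in function, not the real learned model):

```python
import numpy as np

def toy_step(state):
    """Stand-in for one learned 6-hour-ahead prediction (illustrative only)."""
    return 0.99 * state + 0.01 * np.roll(state, 1)

state = np.linspace(0.0, 1.0, 8)  # toy "atmospheric state" vector
steps_per_day = 4                 # 6-hour increments
horizon = []
for _ in range(10 * steps_per_day):  # 40 autoregressive steps = 10 days
    state = toy_step(state)          # feed the model's own output back in
    horizon.append(state.copy())

print(len(horizon))  # 40 predicted 6-hour states
```

Because each step is a single forward pass rather than a physics simulation, the whole rollout is what makes minute-scale 10-day forecasts possible on accelerator hardware.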

Microsoft uses GPT-4V to decode videos, understand movies, and explain them to blind people, even hour-long ones

Large models that have largely mastered language are now entering the visual domain, but even the landmark GPT-4V still has many shortcomings (see “After trying GPT-4V, Microsoft wrote a 166-page evaluation report; industry insiders call it a must-read for advanced users”). Recently, Microsoft Azure AI combined GPT-4V with several specialized tools to build the more powerful MM-Vid, which not only has the basic capabilities of other LMMs, but can also analyze hour-long videos and narrate them for visually impaired listeners.

Microsoft Bing Chat tests a “nosearch” mode that lets users toggle web search for real-time information

Microsoft recently invited some Bing Chat users to test the “nosearch” mode, delivered as a plug-in. When this mode is on, Bing Chat answers questions without drawing on the vast information of the Internet, instead providing more accurate and relevant responses tailored to the user’s preferences and needs.

Robin Li: 20% of Baidu’s code is now completed by AI; too many foundation models is a waste

Robin Li said that Baidu has resolutely restructured its product lines to be AI-native: of every 100 lines of code at Baidu, 20 are now completed by AI. He added that large models are a foundational layer and, like operating systems, the world does not need many of them; repeatedly developing foundation models is a waste of basic resources.

Notion launches a Q&A feature that uses AI to query and retrieve note content

Notion, the popular all-in-one collaboration application for notes, documents, and databases, has released a new artificial intelligence feature called “Q&A” that lets users query the documents, messages, and notes stored in their Notion workspace and get instant answers to their questions.
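Notion has not published how Q&A works internally, but the described behavior, retrieving the most relevant workspace content for a question, can be sketched with a toy keyword-overlap retriever (all notes and the scoring scheme here are purely illustrative):

```python
# Toy retrieval over "workspace notes" -- illustrative of retrieval-style
# Q&A only; Notion's actual implementation is not public.
notes = {
    "roadmap": "Q4 roadmap: ship the mobile redesign and the API beta.",
    "meeting": "Meeting notes: API beta slips to December.",
    "travel": "Offsite travel is booked for January.",
}

def answer(question):
    q_words = set(question.lower().split())
    # Rank notes by word overlap with the question; return the best match.
    best = max(notes, key=lambda k: len(q_words & set(notes[k].lower().split())))
    return notes[best]

print(answer("when does the api beta ship"))
```

A production system would replace the keyword overlap with embedding similarity and pass the retrieved passages to a language model to compose the final answer.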

Launched two weeks before ChatGPT, Galactica was pulled offline and became LeCun’s biggest frustration

When we mention large language models (LLMs) today, the first that comes to mind is OpenAI’s ChatGPT, which over the past year has become hugely popular thanks to its strong performance and broad application prospects. But ChatGPT was not the first large language model: a year ago, two weeks before OpenAI released ChatGPT, Meta released a trial model called Galactica.

Galactica was trained on a massive scientific corpus of papers, reference materials, knowledge bases, and many other sources, including more than 48 million papers, textbooks, and lecture notes, along with millions of chemical compounds and proteins, scientific websites, encyclopedias, and more. At the time, Meta claimed that Galactica could summarize academic literature, solve mathematical problems, generate Wiki articles, write scientific code, and even perform multimodal tasks involving chemical formulas and protein sequences.

However, less than three days after going online, Galactica was quickly taken down: the text it generated was often inaccurate, and it irresponsibly fabricated content.

S-LoRA: serving thousands of LoRA fine-tuned models on one GPU

Generally speaking, deployment of large language models follows a “pre-train, then fine-tune” pattern. But when the base model is fine-tuned for numerous tasks (such as personalized assistants), training and serving costs can become prohibitively expensive. Low-Rank Adaptation (LoRA) is a parameter-efficient fine-tuning method often used to adapt a base model to many tasks, producing a large number of LoRA adapters derived from one base model; this pattern creates many opportunities for batched inference during serving. LoRA research has shown that fine-tuning only the adapter weights can achieve performance comparable to full-weight fine-tuning.

The common deployment approach merges each adapter into the base weights. While this enables low-latency inference for a single adapter, it forces serial execution across adapters, significantly reducing overall throughput and increasing latency when multiple adapters must be served simultaneously. How to serve these fine-tuned variants at scale thus remained unresolved. In a recent paper, researchers from UC Berkeley, Stanford, and other universities propose S-LoRA, a system for serving many LoRA adapters concurrently.
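The core observation behind serving many adapters at once is that the adapter delta need not be merged into the base weights: each request can share one base matrix multiply and add its own cheap low-rank correction. A toy numpy sketch of that batched-adapter idea (dimensions and routing are illustrative; this is not the S-LoRA system itself):

```python
import numpy as np

rng = np.random.default_rng(1)
d, r, n_adapters = 16, 2, 3

W = rng.normal(size=(d, d))  # shared, frozen base weight
adapters = [(rng.normal(size=(d, r)), rng.normal(size=(r, d)))
            for _ in range(n_adapters)]  # (B, A) pairs, one per user/task

# A batch in which each request is routed to a different adapter.
xs = rng.normal(size=(n_adapters, d))
outs = []
for x, (B, A) in zip(xs, adapters):
    # Shared base compute plus a per-request low-rank correction:
    # x @ (W + B @ A).T  ==  x @ W.T + (x @ A.T) @ B.T
    outs.append(x @ W.T + (x @ A.T) @ B.T)
outs = np.stack(outs)
print(outs.shape)  # (3, 16)
```

Keeping adapters unmerged is what lets requests for thousands of different fine-tuned variants share the same base-model weights on one GPU; the real system adds memory paging and custom batched kernels on top of this idea.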
