January 10th Big Model Daily Collection
The Mixtral 8x7B paper is finally here: architecture details and parameter counts disclosed for the first time
Link: https://news.miracleplus.com/share_link/15457
The paper for Mixtral 8x7B, the mixture-of-experts (MoE) model that recently set the entire open-source community ablaze, has finally been released. While the OpenAI team has remained tight-lipped about GPT-4's parameter count and training details, the release of Mixtral 8x7B gives developers an open-source option that is "very close to GPT-4". Notably, it was rumored long ago that OpenAI also built GPT-4 on a Mixture of Experts (MoE) architecture. With the paper's release, a number of research details have now been made public.
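As a rough illustration of the architecture described in the paper, the sketch below implements a sparse MoE layer with top-2 routing over 8 expert feed-forward networks, as in Mixtral. The dimensions are toy values, and the double loop is written for readability rather than the batched dispatch a production implementation would use.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    """Minimal sparse Mixture-of-Experts layer in the style of Mixtral:
    a router picks the top-2 of 8 expert FFNs per token and mixes their
    outputs with gate weights renormalized over the chosen experts."""
    def __init__(self, dim=512, hidden=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim))
            for _ in range(n_experts)
        )

    def forward(self, x):                      # x: (tokens, dim)
        logits = self.router(x)                # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # softmax over the selected experts only
        out = torch.zeros_like(x)
        for k in range(self.top_k):            # loop form for clarity, not speed
            for e in range(len(self.experts)):
                mask = idx[:, k] == e          # tokens routed to expert e at slot k
                if mask.any():
                    out[mask] += weights[mask, k:k+1] * self.experts[e](x[mask])
        return out

x = torch.randn(4, 512)                        # 4 tokens
print(SparseMoELayer()(x).shape)               # torch.Size([4, 512])
```

Only 2 of the 8 expert FFNs run per token, which is why the model's active parameter count per token is far smaller than its total parameter count.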
Discovering process dynamics for scalable perovskite solar cell manufacturing using explainable AI
Link: https://news.miracleplus.com/share_link/15458
Large-area processing of perovskite semiconductor films is complex and can cause unexplained quality variations, which has become a major obstacle to the commercialization of perovskite photovoltaics. Progress on scalable manufacturing processes is currently limited to incremental, largely arbitrary trial and error. In-situ acquisition of photoluminescence videos has the potential to reveal important changes during film formation, but the high dimensionality of the data quickly exceeds the limits of human analysis. An interdisciplinary team of researchers from the Interactive Machine Learning Group and Helmholtz Imaging at the German Cancer Research Center, together with the Light Technology Institute at the Karlsruhe Institute of Technology, used deep learning and explainable artificial intelligence (XAI) to discover relationships between sensor information acquired during perovskite film formation and the resulting solar cell performance metrics, while keeping those relationships interpretable. The researchers further demonstrate how the insights gained can be distilled into actionable recommendations for perovskite film processing, advancing industrial-scale solar cell manufacturing.
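The article does not spell out the team's exact pipeline, but the general pattern is easy to sketch: fit a model that predicts a performance metric from in-situ sensor traces, then attribute the prediction back to the input. The toy example below uses hypothetical data shapes and plain input-gradient saliency (not necessarily the paper's XAI method) to show that two-step structure in PyTorch.

```python
import torch
import torch.nn as nn

# Hypothetical setup: 100 process runs, each with a 64-step in-situ
# photoluminescence trace, predicting one cell-efficiency value per run.
X = torch.randn(100, 1, 64)
y = torch.randn(100, 1)

model = nn.Sequential(
    nn.Conv1d(1, 8, kernel_size=5, padding=2), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(8, 1),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(200):                        # fit the surrogate regressor
    opt.zero_grad()
    nn.functional.mse_loss(model(X), y).backward()
    opt.step()

# Explanation step: input-gradient saliency shows which moments of film
# formation the model relies on for its efficiency prediction.
x = X[:1].clone().requires_grad_(True)
model(x).sum().backward()
saliency = x.grad.abs().squeeze()           # one importance score per time step
print("most influential time step:", int(saliency.argmax()))
```

In the real setting, the time steps flagged by the attribution method are what gets translated into actionable recommendations about the film-formation process.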
Waving the magic wand of code corpus, large models and intelligent agents will summon more powerful energy
Link: https://news.miracleplus.com/share_link/15459
Just as the wandmaker's craft produced legendary wizards of every age, such as Dumbledore, traditional large language models with huge latent potential gain execution abilities beyond their original scope once they are pre-trained or fine-tuned on code corpora. Specifically, code-enhanced large models improve at writing code, reason more strongly, invoke execution interfaces autonomously, and improve themselves independently, all of which benefits them as AI agents performing downstream tasks of every kind. Recently, a research team from the University of Illinois Urbana-Champaign (UIUC) published an important survey exploring how code endows large language models (LLMs), and the intelligent agents built on them, with these powerful capabilities.
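The "invocation of execution interfaces" the survey highlights boils down to a generate-execute-observe loop: the model writes code, the agent runs it, and the output is fed back as an observation. A minimal sketch of that loop is below; the `llm` function is a hypothetical stand-in for a real model call, not an API from the survey.

```python
import subprocess, sys, tempfile

def llm(prompt: str) -> str:
    """Stand-in for a real LLM call; here it just returns a canned snippet."""
    return "print(sum(range(1, 101)))"

def run_python(code: str) -> str:
    """Execution interface: run model-written code and capture its output."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    result = subprocess.run([sys.executable, path],
                            capture_output=True, text=True, timeout=10)
    return result.stdout or result.stderr

task = "Compute the sum of integers from 1 to 100."
code = llm(f"Write Python code for this task:\n{task}")
observation = run_python(code)   # in a real agent, fed back into the next prompt
print(observation)               # 5050
```

Because the environment's feedback is grounded in actual execution rather than the model's own guess, this loop is also what enables the self-improvement behavior the survey describes.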
More cost-effective than the A100! With FlightLLM, large-model inference no longer has to choose between performance and cost.
Link: https://news.miracleplus.com/share_link/15460
Large-scale deployment of large language models on end devices has driven up demand for both compute performance and energy efficiency, opening a genuinely competitive field for inference across algorithms and chips. For the envisioned terminal scenarios, the application potential of GPU- and FPGA-based inference solutions needs to be re-examined. Recently, Wuwen Xinqiong, Tsinghua University, and Shanghai Jiao Tong University jointly proposed a lightweight deployment flow for large models on FPGAs, achieving efficient inference of LLaMA2-7B on a single Xilinx U280 FPGA for the first time. The first author is Zeng Shulin, a Ph.D. graduate of Tsinghua University's Department of Electronic Engineering and head of hardware at Wuwen Xinqiong. The corresponding authors are Dai Guohao, associate professor at Shanghai Jiao Tong University and co-founder and chief scientist of Wuwen Xinqiong, and Wang Yu, professor and chair of the Department of Electronic Engineering at Tsinghua University and founder of Wuwen Xinqiong. The work has been accepted to FPGA'24, the top conference in the field of reconfigurable computing.
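To see why an FPGA with high-bandwidth memory can compete here, a back-of-envelope calculation helps: single-batch decoding streams essentially the full weight set for every generated token, so throughput is capped by memory bandwidth divided by model size, and weight compression raises that cap directly. The numbers below are rough public device specs and generic arithmetic, not figures from the FlightLLM paper.

```python
# Back-of-envelope: why single-batch LLM decoding is memory-bound, and why
# weight compression matters on an FPGA like the U280 (~8 GB HBM2 on board).
# Bandwidth figures are approximate public specs, not results from the paper.
PARAMS = 7e9                    # LLaMA2-7B weight count (approximate)

for name, bytes_per_weight in [("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    size_gb = PARAMS * bytes_per_weight / 1e9
    # Each decoded token streams essentially all weights once, so
    # tokens/s <= memory bandwidth / model size.
    for dev, bw_gbs in [("A100 80GB (~2039 GB/s)", 2039),
                        ("U280 HBM2 (~460 GB/s)", 460)]:
        print(f"{name}: {size_gb:5.1f} GB weights -> "
              f"<= {bw_gbs / size_gb:6.1f} tok/s on {dev}")
```

At fp16, the 14 GB of weights would not even fit in the U280's HBM, while at int4 the model fits and the bandwidth ceiling rises fourfold, which is the basic economics behind aggressive compression in FPGA deployment flows like this one.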