Hugging Face open-source 8B visual model; OpenAI launches bulk APIs | AI Headlines

One-Minute News Recap!

Global AI Highlights

Stanford University AI Research Institute Releases “2024 AI Index Report”

Stanford University today released the “2024 AI Index Report,” revealing that in 2023, ChatGPT’s daily, weekly, and monthly usage rates were 17%, 36%, and 16% of global users, respectively, with the highest usage rates observed in India, Pakistan, and Kenya. The industry dominates AI research, contributing 51 key models, far exceeding the academic sector’s 15 models. A total of 149 foundational models were released during the year, a significant increase compared to previous years, with the majority being open source. Training costs have skyrocketed, with projects like GPT-4 costing $78 million and Gemini Ultra costing $191 million. Investment in generative AI has grown to $25.2 billion, with the United States leading at $67.2 billion, nearly 9 times more than China’s investment during the same period. Despite a decline in AI job postings, AI has improved work efficiency and quality, with 80% of Fortune 500 companies mentioning AI in their financial reports. The number of AI regulations in the United States has increased to 25, with efforts in Europe and America to promote related policies, leading to a doubling of global attention. Public awareness of AI’s impact has deepened, with 66% of respondents expecting AI to greatly influence future life, but only 37% believing AI will improve work. ChatGPT is widely known, with 63% of respondents aware of it, and half using it at least once a week. AI has made significant advances in science and medicine, but there is a lack of unified standards for responsible AI assessment, with concerns about deepfakes and carbon emissions gaining widespread attention.

OpenAI Sora Empowers Major Update to Adobe Premiere Pro

Adobe announced a major upgrade to Premiere Pro by adding third-party AI video generation plugins such as OpenAI’s Sora, Runway ML’s Gen-2, and Pika 1.0. This move is expected to bring AI tools to a wider user base and may trigger profound changes in the video production industry. In the future, users will be able to seamlessly integrate live-action footage with AI-generated scenes in the same editing interface, such as easily incorporating AI-generated character actions and backgrounds into films, or extending shots and optimizing transition effects. In addition, the Firefly for Video feature will support intelligent object detection and removal, allowing users to quickly change or remove objects in videos, while also having text-to-video image generation capabilities, competing with top AI video generation tools such as Sora and Runway. Adobe believes that the value of AI-generated content lies in integrating it into everyday workflows, helping users embark on innovative journeys.

Hugging Face Launches 8B Vision Large Model Idefics2

Following the release of the Idefics visual language model based on DeepMind technology in 2023, Hugging Face has recently launched its upgraded version, Idefics2. This new model, with 8 billion parameters and fully open source, achieves significant improvements in OCR recognition and image processing. Idefics2 has been streamlined to a scale of 8 billion parameters, comparable to DeepSeek-VL and LLaVA-NeXT-Mistral-7B, and can flexibly handle images with a maximum native resolution of 980 x 980 pixels and any aspect ratio without the need for common square size adjustments seen in traditional CV.

Open Source Link

Former PayPal CEO Dan Schulman: 80% of Jobs Will See Responsibilities Reduced to 20%

Former PayPal CEO Dan Schulman recently spoke at LTF 2024 (Latin America Tech Forum organized by Riverwood Capital at the New York Stock Exchange), stating that the release of GPT-5 would be a moment of panic, with 80% of jobs seeing responsibilities reduced to 20%.

OpenAI Introduces Batch API: Optimizing Costs and Enhancing Asynchronous Task Processing

OpenAI’s developer platform has launched the Batch API, designed for asynchronous tasks such as summarization, translation, and image classification, to save costs and improve processing speed. Users only need to upload batch request files and receive results within 24 hours, enjoying a half-price discount on API prices. This service simplifies large-scale data processing workflows, balancing costs and efficiency, highlighting OpenAI’s commitment to high-value solutions and enhancing the economic viability of AI technology applications in various fields.

Rewind Releases Wearable AI Device Limitless Series, Records Conversations All Day

Rewind has officially launched its new wearable AI product, Limitless, which includes the meeting assistant Limitless Meetings and the wearable pendant Pendant. Limitless Meetings, with its core focus on automated meeting management, intelligent recording, and summarization, is compatible with various meeting platforms. The Pendant, as the world’s smallest AI wearable device, can record conversations all day and store personal insights, equipped with Wi-Fi and Bluetooth capabilities, with a battery life of up to 100 hours. Users can simply touch or long-press to wake up personalized AI and interact with it, review, and retrieve relevant information.

Poe Platform Introduces Multi-Mode Interaction, Leading the AI Chatbot Trend in the Enterprise Market

Poe, the AI chatbot platform under the question-and-answer community Quora, has received a $75 million investment and continues to expand its functionality, aiming to become a one-stop service center for aggregating various conversation AI models. Its innovative feature, “multi-bot chat,” allows users to interact with multiple AI models in a single conversation, such as invoking GPT-4 for analysis, Claude for creative assistance, and DALL-E 3 for image generation on platforms like Slack. Poe aims to optimize user experience by leveraging the increasingly rich AI model ecosystem, integrating the best resources, and with this feature and the upcoming enterprise version, Poe is aggressively entering and leading the AI chatbot market.

WizardLM-2 Series Models Launched, Innovating Training Methods and Synthetic Data Systems

WizardLM has launched the WizardLM-2 series models (8x22B, 70B, 7B) to address the shortage of natural data, using an AI synthetic data training system. Its core strategy includes two main parts:

  1. Fine-tuning of data preprocessing, from data analysis to weighted sampling, ensuring that the model encounters comprehensive and high-quality training materials.

  2. Innovative progressive learning practices, Evol Lab technology enables the model to generate high-quality instructions and improve responses, and through the “AI teaching AI” (AAA) framework, multi-model cross-teaching enhances performance. Additionally, WizardLM-2 combines supervised learning, Stage-DPO reinforcement learning optimization, and RLEIF reward mechanisms to effectively improve model accuracy and adaptability.

Open Source Link (Hugging Face)
GitHub Link

Pile-T5: EleutherAI’s Next-Generation T5 Model Optimized for Code Tasks

EleutherAI’s Pile-T5 model, specifically optimized for handling code tasks that T5 struggled with, uses a more precise LLaMA tokenizer for handling code tokens and doubles the training data to 20 trillion tokens. Although it maintains T5’s hyperparameter settings, Pile-T5 significantly improves performance after fine-tuning by combining T5x technology. In benchmark tests such as SuperGLUE and CodeXGLUE’s “code-to-text” subtasks, Pile-T5 demonstrates outstanding performance surpassing T5-v1.1, especially in code-related domains, showing particularly clear improvements.

Title of this article:<Hugging Face open-source 8B visual model; OpenAI launches bulk APIs | AI Headlines>Author:minimini
Original link:https://www.xxmjw.com/post/24.html
Unless otherwise specified, all content is original. Please indicate when reprinting.

Related

minimini

minimini