Tech Press Review - Oct 11th 2023
---
Replit has announced that its AI features are now available to all developers. With code completion and assistance enabled by default, more than 23 million developers can use Replit AI. Basic AI features are accessible to free users, while Pro users get exclusive access to more advanced AI models and features. The release of replit-code-v1.5-3b, a state-of-the-art 3B-parameter language model with a code-heavy pretraining mixture, further improves code completion in the Replit editor. Looking ahead, Replit's roadmap states that AI will redefine every feature of the platform: the company believes AI should be a fundamental part of how software is edited and deployed. In line with this vision, Replit aims to become synonymous with AI for software creators, in service of its mission of empowering the next billion software creators.
Source => https://blog.replit.com/ai4all
---
"Attention Sinks in LLMs for Endless Fluency" is a community blog post published on October 9, 2023. The post discusses the use of attention sink tokens to improve the fluency and memory usage of pretrained chat-style Large Language Models (LLMs) such as Llama, Mistral, MPT, Falcon, and GPT-NeoX (Pythia).
LLMs have revolutionized the fields of chatbots and virtual assistants, but they suffer from limitations. Two major restrictions are VRAM usage and loss of fluency. VRAM usage grows with the length of the input, which limits how long the model can be prompted sequentially, while fluency is lost once the input grows too long.
To address these limitations, the post introduces window attention and attention sink tokens. Window attention limits the tokens fed to the LLM to the most recent ones and keeps memory usage constant, but on its own it loses fluency once the first tokens are evicted from the window. Attention sinks are the first few tokens in the sequence, which receive a disproportionately large share of the attention score. By always keeping the attention sink tokens in the window, the LLM can maintain fluency even when later tokens are evicted.
Experiments using attention sinks show promising results. LLMs using window attention with attention sinks have constant space complexity and stable perplexity. They exhibit better fluency and lower memory usage than the same models loaded with plain transformers.
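The eviction policy described above can be sketched in a few lines of Python. This is a toy illustration only, not the attention_sinks implementation: the class name `SinkWindowCache`, its parameters, and the sizes used are assumptions made for the example, and integers stand in for cached tokens.

```python
from collections import deque


class SinkWindowCache:
    """Toy cache: keeps the first `sink_size` tokens plus the most recent `window_size` tokens.

    Hypothetical helper for illustration; names and sizes are assumptions.
    """

    def __init__(self, sink_size=4, window_size=8):
        self.sink_size = sink_size
        self.sinks = []
        # A deque with maxlen drops its oldest entry automatically when full.
        self.recent = deque(maxlen=window_size)

    def append(self, token):
        # The first few tokens become permanent attention sinks.
        if len(self.sinks) < self.sink_size:
            self.sinks.append(token)
        else:
            self.recent.append(token)

    def tokens(self):
        return self.sinks + list(self.recent)


cache = SinkWindowCache(sink_size=4, window_size=8)
for position in range(100):
    cache.append(position)

kept = cache.tokens()
print(kept)  # [0, 1, 2, 3, 92, 93, 94, 95, 96, 97, 98, 99]
```

However many tokens are appended, the cache never holds more than `sink_size + window_size` entries, which is the constant-memory property the post describes.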
The post also provides practical information on implementing attention sinks. The attention_sinks Python module can be used as a drop-in replacement for the transformers API. It supports various LLM architectures and allows for configuring the attention sink cache size and window size.
The benchmarks and experiments mentioned in the post demonstrate the effectiveness of attention sinks in improving LLM stability and fluency. Attention sinks are recommended for organizations and users interested in using assistant-style LLMs.
To learn more about attention sinks and their application in LLMs, additional sources and a FAQ section are provided.
Source => https://huggingface.co/blog/tomaarsen/attention-sinks
---
Robotic Transformer 2 (RT-2) is a vision-language-action (VLA) model that learns from both web and robotics data and translates this knowledge into generalized instructions for robotic control. High-capacity vision-language models (VLMs) are trained on web-scale datasets, which makes them good at recognizing visual and language patterns and at operating across different languages; RT-2 carries these capabilities over to robots by also learning from robot data.
RT-2 builds on its predecessor, Robotic Transformer 1 (RT-1), a model trained on multi-task demonstrations, and extends traditional VLMs with vision-language-action (VLA) learning. Using real robot demonstration data collected over a period of 17 months in an office kitchen environment, RT-2 demonstrates improved generalization abilities and semantic understanding beyond the robotic data it was exposed to.
One notable feature of RT-2 is its ability to perform chain-of-thought reasoning, enabling it to make multi-stage semantic inferences. For example, it can decide which object can be used as an improvised hammer or determine the best drink for a tired person.
To achieve robotic control, RT-2 utilizes the Pathways Language and Image model (PaLI-X) and Pathways Language model Embodied (PaLM-E) as its backbones. By representing actions as tokens in the model's output, RT-2 can generate sequences of commands for the robot to perform, making it trainable on robotic data without the need to change input and output spaces.
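To make the "actions as tokens" idea concrete, here is a minimal sketch, assuming each continuous action dimension is clipped to a fixed range and discretized into integer bins that are then written out as text. The function names, the value range of [-1, 1], and the bin count of 256 are assumptions for illustration and are not taken from the summary above.

```python
def encode_action(values, low=-1.0, high=1.0, bins=256):
    """Map each continuous value to an integer bin and join the bins as a text string."""
    tokens = []
    for v in values:
        clipped = min(max(v, low), high)
        # Scale to [0, bins - 1] and round to the nearest bin index.
        index = round((clipped - low) / (high - low) * (bins - 1))
        tokens.append(str(index))
    return " ".join(tokens)


def decode_action(text, low=-1.0, high=1.0, bins=256):
    """Invert encode_action, up to quantization error."""
    return [low + int(tok) / (bins - 1) * (high - low) for tok in text.split()]


action = [0.25, -1.0, 1.0, 0.5]
encoded = encode_action(action)
decoded = decode_action(encoded)
print(encoded)  # 159 0 255 191
```

Because the encoded action is just a short string of integers, a language model can emit it with its ordinary output vocabulary, and a controller can decode it back into approximate continuous commands.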
Through rigorous qualitative and quantitative experiments, RT-2 showcases its emergent skills in symbol understanding, reasoning, and human recognition. Compared to previous baselines, RT-2 achieves over a 3x improvement in generalization performance and demonstrates superior performance on both seen and unseen tasks.
On the open-source Language Table benchmark, RT-2 outperforms previous baselines, achieving a success rate of 90% in simulation. Real-world evaluations of the same model show that it can generalize to novel objects.
Additionally, RT-2 incorporates chain-of-thought reasoning to enable long-horizon planning and low-level skills within a single model. By combining language and action, RT-2 can execute complex commands that require reasoning about intermediate steps.
The advancements of RT-2 highlight the potential of transforming VLMs into powerful VLA models that can directly control robots. It paves the way for the development of a general-purpose physical robot capable of reasoning, problem-solving, and interpreting information for various real-world tasks.
Source => https://www.deepmind.com/blog/rt-2-new-model-translates-vision-and-language-into-action
---
This article from traefik.me explains the origin and use of "traefik.me", a wildcard DNS domain whose name comes from Traefik (traefik.io), an open-source reverse proxy and load balancer. When used in conjunction with Docker and Traefik, the domain proves to be a valuable tool for local web development.
The article also provided a sample docker-compose.yml file that demonstrates how to use Traefik. By setting up the file with the appropriate configurations, users can easily run multiple services, including Traefik itself. This allows for seamless routing of inbound traffic to different applications.
After launching the docker-compose.yml file with "docker-compose up", users can reach their applications by visiting the designated URLs, such as "app1.traefik.me" or "app2.traefik.me", in a web browser. The article emphasized that there is no need for complex configurations or modifications to the "/etc/hosts" file.
To access the application from another device on the local network, the article provided an additional Docker label. By utilizing the label: "traefik.http.routers.app1.rule=HostRegexp(`app1.{ip:.*}.traefik.me`)", users can reach their app1 Docker container by visiting http://app1.10.0.0.1.traefik.me from any device on the local network. This feature is especially useful for developers looking to test their applications on multiple devices.
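As a rough illustration of what the HostRegexp label above matches, the rule can be approximated with a Python regular expression. This is a sketch under an assumption: that the literal parts of the rule are matched verbatim and the `{ip:.*}` placeholder behaves like a named capture group. It is not Traefik's own matching code, and `match_host` is a hypothetical helper.

```python
import re

# Approximate Python translation of HostRegexp(`app1.{ip:.*}.traefik.me`):
# literal dots are escaped and {ip:.*} becomes the named group "ip".
RULE = re.compile(r"^app1\.(?P<ip>.*)\.traefik\.me$")


def match_host(host):
    """Return the captured `ip` segment if the host matches the rule, else None."""
    m = RULE.match(host)
    return m.group("ip") if m else None


print(match_host("app1.10.0.0.1.traefik.me"))  # 10.0.0.1
print(match_host("app2.10.0.0.1.traefik.me"))  # None
print(match_host("app1.traefik.me"))           # None
```

Under this translation, the IP-embedding hostname is routed to app1, while the plain "app1.traefik.me" name does not match this particular rule and would need its own Host rule.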
For those interested in using Docker Compose with HTTPS, the article recommended checking out a sample docker-compose.yml file for reference.
Overall, Traefik offers a straightforward and convenient solution for managing and routing web applications. Its seamless integration with Docker makes it a valuable asset for developers.
Source => https://traefik.me/
---
Reka has introduced Yasa-1, a multimodal assistant. Yasa-1 is a language assistant with visual and auditory sensors that can take actions via code execution. Reka trained Yasa-1 from scratch and optimized both its training and serving infrastructure. The assistant offers a range of features, including long-context document processing, fast retrieval-augmented generation, multilingual support, a search engine interface, and a code interpreter.
Currently in private preview, Yasa-1 is available through APIs and Docker containers for on-premise or virtual private cloud deployment. Reka says it is committed to deploying Yasa-1 safely and responsibly and plans to expand access to enterprise and organization partners in the coming weeks.
Beyond multimodal understanding, Yasa-1 can connect to the web and use various commercial search engines, providing up-to-date information that is not limited by a knowledge cutoff date. Yasa-1 can also be taught to understand private datasets, allowing integration of internal datasets of any modality type.
Yasa-1 includes a long-context model that supports documents of up to 100K tokens. In tests on movie plots, Reka found that Yasa-1 achieved comparable quality while being approximately 8 times faster than using a state-of-the-art 100K-context model directly.
Another notable feature of Yasa-1 is its code interpreter. The assistant can execute code and include the results in its response, whether that means performing arithmetic operations, analyzing spreadsheets, or creating visualizations.
With Yasa-1, Reka offers a multimodal assistant with a broad range of features and capabilities. To learn more or request a demo of Yasa-1, reach out to Reka at contact@reka.ai.
Source => https://reka.ai/announcing-our-multimodal-ai-assistant/