
The increasing integration of robots across various sectors, from industrial manufacturing to daily life, highlights a growing need for advanced navigation systems.
However, contemporary robot navigation systems face significant challenges in diverse and complex indoor environments, exposing the limitations of traditional approaches.
Addressing the fundamental questions of "Where am I?", "Where am I going?", and "How do I get there?", ByteDance has developed Astra, a dual-model architecture designed to overcome these navigation bottlenecks and enable general-purpose mobile robots.

Traditional navigation systems typically consist of multiple smaller, often rule-based modules that handle the core challenges of target localization, self-localization, and path planning.
Target localization involves understanding natural language or image cues to pinpoint a destination on a map.
Self-localization requires a robot to determine its precise position within a map, which is especially challenging in repetitive environments like warehouses, where traditional methods often rely on artificial landmarks (e.g., QR codes).
Path planning further divides into global planning for rough route generation and local planning for real-time obstacle avoidance and reaching intermediate waypoints.

While foundation models have shown promise in integrating smaller models to tackle broader tasks, the optimal number of models and their effective integration for comprehensive navigation remained an open question.
ByteDance's Astra, detailed in the paper "Astra: Toward General-Purpose Mobile Robots via Hierarchical Multimodal Learning" (website: https://astra-mobility.github.io/), addresses these limitations.
Following the System 1/System 2 paradigm, Astra features two primary sub-models: Astra-Global and Astra-Local.
Astra-Global handles low-frequency tasks like target and self-localization, while Astra-Local manages high-frequency tasks such as local path planning and odometry estimation.
This architecture promises to revolutionize how robots navigate complex indoor spaces.

Astra-Global: The Intelligent Brain for Global Localization

Astra-Global serves as the intelligent core of the Astra architecture, responsible for critical low-frequency tasks: self-localization and target localization.
It functions as a Multimodal Large Language Model (MLLM), adept at processing both visual and linguistic inputs to achieve precise global positioning within a map.
Its strength lies in utilizing a hybrid topological-semantic graph as contextual input, allowing the model to accurately locate positions based on query images or text prompts.

The construction of this robust localization system begins with offline mapping.
The research team developed an offline method to build a hybrid topological-semantic graph G=(V,E,L):

V (Nodes): Keyframes are obtained by temporally downsampling the input video; each becomes a node that encodes its SfM-estimated 6-Degrees-of-Freedom (DoF) camera pose and references to landmarks.
E (Edges): Undirected edges establish connectivity based on relative node poses, crucial for global path planning.
L (Landmarks): Semantic landmark information is extracted by Astra-Global from visual data at each node, enriching the map's semantic understanding. These landmarks store semantic attributes and are connected to multiple nodes via co-visibility relationships.

In practical localization, Astra-Global's self-localization and target localization capabilities use a coarse-to-fine two-stage process for visual-language localization.
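
Both stages query the hybrid graph G=(V,E,L) built offline. As a rough illustration of that structure (not the authors' implementation), the Python sketch below stores keyframe nodes, undirected edges, and landmarks linked to the nodes that observe them. The class names, the six-number pose tuple, and the example landmark descriptions are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A keyframe node: camera pose plus references to visible landmarks."""
    node_id: int
    pose: tuple  # assumed layout: (x, y, z, roll, pitch, yaw)
    landmark_ids: set = field(default_factory=set)


@dataclass
class Landmark:
    """A semantic landmark, linked to every node that observes it."""
    landmark_id: int
    description: str  # semantic attributes, kept as free text here
    node_ids: set = field(default_factory=set)


class HybridGraph:
    """Hypothetical container for G = (V, E, L)."""

    def __init__(self):
        self.nodes = {}      # V: node_id -> Node
        self.edges = {}      # E: node_id -> set of neighbouring node_ids
        self.landmarks = {}  # L: landmark_id -> Landmark

    def add_node(self, node_id, pose):
        self.nodes[node_id] = Node(node_id, pose)
        self.edges.setdefault(node_id, set())

    def add_edge(self, a, b):
        # Edges are undirected, so record the link in both directions.
        self.edges[a].add(b)
        self.edges[b].add(a)

    def add_landmark(self, landmark_id, description, seen_from):
        self.landmarks[landmark_id] = Landmark(
            landmark_id, description, set(seen_from)
        )
        for node_id in seen_from:
            self.nodes[node_id].landmark_ids.add(landmark_id)

    def nodes_for_landmark(self, landmark_id):
        """Landmark-to-node association: which keyframes see it."""
        return sorted(self.landmarks[landmark_id].node_ids)

    def covisible_landmarks(self, a, b):
        """Landmarks observed from both node a and node b."""
        return self.nodes[a].landmark_ids & self.nodes[b].landmark_ids


# Tiny made-up map: three keyframes along a corridor.
graph = HybridGraph()
for i in range(3):
    graph.add_node(i, (float(i), 0.0, 0.0, 0.0, 0.0, 0.0))
graph.add_edge(0, 1)
graph.add_edge(1, 2)
graph.add_landmark(0, "door labelled 'Room 101'", seen_from=[0, 1])
graph.add_landmark(1, "sofa in the resting area", seen_from=[1, 2])

print(graph.nodes_for_landmark(1))
print(graph.covisible_landmarks(0, 1))
```

Keeping the landmark-to-node links in both directions means a matched landmark leads straight to candidate keyframes, and two keyframes can be compared by the landmarks they share.
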
The coarse stage analyzes input images and localization prompts, detects landmarks, establishes correspondence with a pre-built landmark map, and filters candidates based on visual consistency.
The fine stage then uses the query image and coarse output to sample reference map nodes from the offline map, comparing their visual and positional information to directly output the predicted pose.

For language-based target localization, the model interprets natural language instructions, identifies relevant landmarks using their functional descriptions within the map, and then uses landmark-to-node associations to locate relevant nodes, retrieving target images and 6-DoF poses.

To give Astra-Global robust localization abilities, the team used a two-stage training methodology.
Using Qwen2.5-VL as the backbone, they combined Supervised Fine-Tuning (SFT) with Group Relative Policy Optimization (GRPO).
SFT involved diverse datasets for various tasks, including coarse and fine localization, co-visibility detection, and motion trend estimation.
In the GRPO phase, a rule-based reward function (including format, landmark extraction, map matching, and extra landmark rewards) was used to train for visual-language localization.
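
To make the reward structure concrete, here is a minimal sketch. It assumes each of the four named components has already been scored in [0, 1] and combines them with equal weights; the weights, the scoring, and the function names are assumptions, not the paper's values. The second function shows the group-relative normalization that gives GRPO its name: each sampled response is scored against the mean and standard deviation of its own group.

```python
from statistics import mean, pstdev

# The four reward terms named in the text; equal weights are an assumption.
REWARD_WEIGHTS = {
    "format": 1.0,
    "landmark_extraction": 1.0,
    "map_matching": 1.0,
    "extra_landmark": 1.0,
}


def total_reward(scores):
    """Weighted sum of per-component scores (each assumed to lie in [0, 1])."""
    return sum(REWARD_WEIGHTS[name] * scores[name] for name in REWARD_WEIGHTS)


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group, as in GRPO."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four hypothetical responses sampled for the same localization prompt.
group = [
    {"format": 1, "landmark_extraction": 1, "map_matching": 1, "extra_landmark": 1},
    {"format": 1, "landmark_extraction": 1, "map_matching": 0, "extra_landmark": 0},
    {"format": 0, "landmark_extraction": 0, "map_matching": 0, "extra_landmark": 0},
    {"format": 1, "landmark_extraction": 0, "map_matching": 1, "extra_landmark": 0},
]
rewards = [total_reward(s) for s in group]
advantages = group_relative_advantages(rewards)
print(rewards)
print([round(a, 3) for a in advantages])
```

Responses that beat their group's average receive positive advantages and are reinforced; if every response in a group earns the same reward, all advantages are zero and that group contributes no learning signal.
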
Experiments showed GRPO significantly improved Astra-Global's zero-shot generalization, achieving 99.9% localization accuracy in unseen home environments, surpassing SFT-only methods.

Astra-Local: The Intelligent Assistant for Local Planning

Astra-Local acts as the intelligent assistant for Astra's high-frequency tasks, a multi-task network capable of efficiently generating local paths and accurately estimating odometry from sensor data.
Its architecture comprises three core components: a 4D spatio-temporal encoder, a planning head, and an odometry head.

The 4D spatio-temporal encoder replaces the perception and prediction modules of a traditional mobile robot stack.
It begins with a 3D spatial encoder that processes N omnidirectional images through a Vision Transformer (ViT) and Lift-Splat-Shoot to convert 2D image features into 3D voxel features.
This 3D encoder is trained using self-supervised learning via 3D volumetric differentiable neural rendering.
The 4D spatio-temporal encoder then builds upon the 3D encoder, taking past voxel features and future timestamps as input to predict future voxel features through ResNet and DiT modules, providing current and future environmental representations for planning and odometry.

The planning head takes the pre-trained 4D features, robot speed, and task information as input and generates executable trajectories using Transformer-based flow matching.
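
Flow matching generates a sample by integrating a learned velocity field from noise at t=0 to data at t=1. The sketch below shows only that sampling loop, using Euler steps on a 2D waypoint trajectory. In place of the Transformer it uses a closed-form velocity field that points at a fixed straight-line target, so the target trajectory, step count, and function names are illustrative assumptions rather than anything taken from Astra-Local.

```python
import random


def velocity_field(x, t, target):
    """Stand-in for the learned network.

    For the straight-line path x_t = (1 - t) * x_0 + t * x_1, the velocity
    that carries x_t to the target x_1 is (x_1 - x_t) / (1 - t).
    """
    return [
        ((tx - px) / (1.0 - t), (ty - py) / (1.0 - t))
        for (px, py), (tx, ty) in zip(x, target)
    ]


def sample_trajectory(target, steps=10, seed=0):
    """Integrate dx/dt = v(x, t) from noise (t=0) to a trajectory (t=1)."""
    rng = random.Random(seed)
    # Start from Gaussian noise, one 2D point per waypoint.
    x = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in target]
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt
        v = velocity_field(x, t, target)
        x = [(px + dt * vx, py + dt * vy) for (px, py), (vx, vy) in zip(x, v)]
    return x


# Hypothetical target: five waypoints heading straight along +x.
target = [(0.5 * i, 0.0) for i in range(5)]
trajectory = sample_trajectory(target)
print([(round(px, 3), round(py, 3)) for px, py in trajectory])
```

In a trained planner the velocity would come from a network conditioned on the 4D features, robot speed, and task information, and different noise samples would yield different feasible trajectories.
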
To prevent collisions, the planning head incorporates a masked ESDF loss (Euclidean Signed Distance Field).
This loss calculates the ESDF of a 3D occupancy map and applies a 2D ground truth trajectory mask, significantly reducing collision rates.
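
The idea can be sketched on a small grid. This is a simplified 2D illustration, not the paper's formulation: the brute-force distance field, the hinge penalty with a safety margin, and the grid itself are assumptions. The text also does not say how the trajectory mask enters the loss, so the sketch assumes cells on the ground-truth trajectory are exempt from the penalty.

```python
import math


def esdf(occupancy):
    """Brute-force signed distance field on a 2D grid.

    Free cells get +distance to the nearest occupied cell; occupied cells
    get -distance to the nearest free cell. Assumes both kinds exist.
    """
    rows, cols = len(occupancy), len(occupancy[0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    occupied = [rc for rc in cells if occupancy[rc[0]][rc[1]]]
    free = [rc for rc in cells if not occupancy[rc[0]][rc[1]]]
    field = [[0.0] * cols for _ in range(rows)]
    for r, c in cells:
        others = free if occupancy[r][c] else occupied
        d = min(math.hypot(r - r2, c - c2) for r2, c2 in others)
        field[r][c] = -d if occupancy[r][c] else d
    return field


def masked_esdf_loss(pred_cells, field, mask, margin=1.5):
    """Mean hinge penalty over predicted waypoints in unmasked cells."""
    penalties = [
        max(0.0, margin - field[r][c]) for r, c in pred_cells if mask[r][c]
    ]
    return sum(penalties) / len(penalties) if penalties else 0.0


# 5x5 map with one obstacle in the centre.
occupancy = [[0] * 5 for _ in range(5)]
occupancy[2][2] = 1
field = esdf(occupancy)

# Assumed mask: 0 on the ground-truth trajectory (row 1), 1 elsewhere.
gt_cells = [(1, c) for c in range(5)]
mask = [[1] * 5 for _ in range(5)]
for r, c in gt_cells:
    mask[r][c] = 0

loss_far = masked_esdf_loss([(0, c) for c in range(5)], field, mask)
loss_through = masked_esdf_loss([(2, c) for c in range(5)], field, mask)
loss_on_gt = masked_esdf_loss(gt_cells, field, mask)
print(loss_far, loss_through, loss_on_gt)
```

Under these assumptions a path that stays clear of the obstacle costs nothing, a path through it is penalized, and the demonstrated path is never penalized even where it passes within the margin.
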
Experiments demonstrate its superior performance in collision rate and overall score on out-of-distribution (OOD) datasets compared to other methods.

The odometry head predicts the robot's relative pose using current and past 4D features and additional sensor data (e.g., IMU, wheel data).
It trains a Transformer model to fuse information from different sensors.
Each sensor modality is processed by a modality-specific tokenizer, combined with modality embeddings and temporal positional embeddings, and fed into a Transformer encoder; a CLS token is then used to predict the relative pose.
Experiments showed the odometry head's excellent performance in multi-sensor fusion and pose estimation, significantly improving rotational accuracy and reducing overall trajectory error.
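
The input side of the odometry head can be sketched without a deep learning framework. The code below only assembles the token sequence that would be fed to the Transformer encoder; the encoder and pose regressor are omitted. The token width, the zero-padding tokenizer, the one-hot modality embeddings, and the sinusoidal temporal embedding are all stand-ins chosen for illustration, since the real components are learned.

```python
import math

D_MODEL = 4  # tiny token width, for illustration only

# Learned in a real model; fixed one-hot vectors here (assumption).
MODALITY_EMBEDDING = {
    "image": [1.0, 0.0, 0.0, 0.0],
    "imu": [0.0, 1.0, 0.0, 0.0],
    "wheel": [0.0, 0.0, 1.0, 0.0],
}
CLS_TOKEN = [0.0] * D_MODEL  # also learned in practice


def tokenize(values):
    """Stand-in for a modality-specific tokenizer: zero-pad or truncate."""
    return (list(values) + [0.0] * D_MODEL)[:D_MODEL]


def temporal_embedding(step):
    """Sinusoidal embedding of the time step (one common choice, assumed)."""
    out = []
    for i in range(D_MODEL):
        angle = step / (10000 ** ((i - i % 2) / D_MODEL))
        out.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return out


def build_sequence(readings):
    """Turn (modality, step, values) readings into encoder input tokens."""
    sequence = [CLS_TOKEN]  # the encoder's CLS output would feed the pose head
    for modality, step, values in readings:
        parts = zip(
            tokenize(values),
            MODALITY_EMBEDDING[modality],
            temporal_embedding(step),
        )
        sequence.append([a + b + c for a, b, c in parts])
    return sequence


# Hypothetical readings from two time steps.
readings = [
    ("image", 0, [0.3, 0.1, 0.4, 0.1]),
    ("imu", 0, [0.1, 0.2]),
    ("wheel", 1, [0.5]),
]
sequence = build_sequence(readings)
print(len(sequence), "tokens of width", len(sequence[0]))
```

Because every token carries both a modality tag and a time tag, sensors sampled at different rates can share one sequence, and attention can relate, for example, an IMU reading to the image features from the same instant.
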


Experimental Validation

Extensive experiments were conducted in diverse indoor environments (warehouses, offices, homes) to comprehensively evaluate Astra's performance.

Astra-Global's multimodal localization capabilities were validated through various experiments, demonstrating superior performance in handling text and image localization queries.
For target localization, it accurately identifies matching images and poses based on text commands (e.g., "find the resting area").
Compared to traditional Visual Place Recognition (VPR) methods, Astra-Global exhibits significant advantages in:

Detail Capture: Whereas VPR relies on global features, Astra-Global captures fine details like room numbers, preventing localization errors in similar scenes.
Viewpoint Robustness: Based on semantic landmarks, Astra-Global maintains stable localization even with large camera angle changes, where VPR methods typically fail.
Pose Accuracy: Astra-Global leverages landmark spatial relationships to select the best matching pose, showing significantly higher pose accuracy (within 1-meter distance error and 5-degree angular error) than traditional VPR, with over 30% improvement in warehouse environments.

Astra-Local's planning and odometry heads were thoroughly evaluated.
The planning head, using Transformer-based flow matching and masked ESDF loss, outperformed methods like ACT and diffusion policies in collision rate, speed, and overall score on OOD datasets.
This highlights the masked ESDF loss's effectiveness in mitigating collision risks.

The odometry head's performance was assessed on multimodal datasets including synchronized image sequences, IMU, wheel data, and ground truth poses.
Compared to two-frame BEV-ODOM baselines, Astra-Local's odometry head showed significant advantages in multi-sensor fusion and pose estimation.
Integrating IMU data dramatically improved rotational estimation accuracy, reducing overall trajectory error to approximately 2%.
Further inclusion of wheel data enhanced scale stability and estimation accuracy, validating its superior multi-sensor data fusion capabilities.

Astra holds significant promise for future development and applications.
Its deployment can be expanded to more complex indoor environments like large shopping malls, hospitals, and libraries, where it can assist in tasks such as precise product location, efficient medical supply delivery, and book organization.

However, areas for improvement exist.
For Astra-Global, while current map representations balance information loss and token length, they may occasionally lack critical semantic details.
Future work will focus on alternative map compression methods to optimize efficiency while maximizing semantic information retention.
Additionally, current single-frame localization can fail in feature-scarce or highly repetitive environments; future plans include active exploration mechanisms and temporal reasoning for more robust localization.

For Astra-Local, improving robustness to out-of-distribution (OOD) scenarios is crucial, requiring enhanced model architectures and training methods.
Redesigning the fallback system for tighter integration and seamless switching is also planned to improve system stability.
Furthermore, integrating instruction-following capabilities will enable robots to understand and execute natural language commands, expanding their usability in dynamic, human-centric environments and fostering more natural human-robot interaction.




