RoboScience CEO Ye Tian: A disciple of Andrew Ng armed with Apple’s product philosophy, imbuing robots with a sense of “life”
Release Date:
2025-10-14 17:26
Source:
The following article is sourced from Cyzone, by author Lucas
Author | Lucas
Editor | Liu Hengtao
Image source | RoboScience
On the cutting edge of embodied intelligence, RoboScience founder Ye Tian brings with him his experience from Stanford and Apple. He is striving to define embodied robotics through a more fundamental mode of thinking.
This entrepreneur, who has been deeply immersed in artificial intelligence for many years, earned his undergraduate degree from the Department of Physics at the University of Science and Technology of China, then went on to pursue advanced studies at Stanford University’s AI Lab under the mentorship of Professor Andrew Ng. After graduation, he joined Apple, where he quickly rose to become the technical lead of the Apple AI Platform. He led his team in building a core platform hailed as “Apple’s PyTorch and CUDA” and spearheaded the large-scale deployment of numerous key AI technologies across Apple’s ecosystem and the App Store, laying the foundation for Apple’s AI architecture. Today, he is applying this platform experience and these pivotal AI capabilities to the realm of embodied intelligence, forging a wholly new product portfolio and ecosystem for the industry.
Deeply influenced by Apple’s product philosophy, RoboScience adheres to an “integrated hardware‑software” approach. The team is developing robotic systems and proprietary core components, targeting B‑to‑B applications such as logistics and commercial services, while also envisioning consumer‑grade products that serve as trusted “partners” for end users.
RoboScience is accelerating technological innovation and real-world deployment, with the vision of becoming a leading brand in the field of embodied intelligence within five years—empowering robots with “life” through technology and driving the industry from exploratory research toward large-scale value creation.
On the capital front, RoboScience closed its angel round in July this year, led by JD.com, with participation from China Merchants Capital and SenseTime Guoxiang Capital, while existing investor 01 Venture continued to invest.

In the face of intense industry competition, what are RoboScience’s core strengths, what strategic plans does it have for the future, and how does it view industry trends? Below are the responses from Ye Tian, Founder and CEO of RoboScience.
Stanford’s academic imprint, Apple’s product soul
Cyzone: You studied physics at the University of Science and Technology of China in your early years, and later joined Professor Andrew Ng’s AI Lab at Stanford. What led you to choose this academic path?
Ye Tian: From an early age, I have been deeply fascinated by knowledge across diverse fields—such as biology, zoology, paleontology, and the history of anthropology—driven by a desire to grasp the underlying principles that govern the world. During junior and senior high school, mathematics and physics became my favorite subjects. In my undergraduate years, I majored in physics while also taking courses in mathematics and computer science. I aspired to gain a comprehensive understanding of how both society and the natural world operate, even though at the time I was still uncertain about what specific contributions I might make.
Around 2012 and 2013, I heard Professor Andrew Ng talk about the Google Brain result known as “Google Cat,” in which a neural network learned to recognize cats without any supervision by watching YouTube videos. It left me profoundly impressed and convinced that AI is the path to creating intelligence. From that moment on, I aspired to dedicate my career to using technology to build intelligent systems.
Later, I went to Stanford and had the privilege of conducting research alongside Professor Andrew Ng.

Cyzone: How did this academic experience shape your subsequent career and your perspective on technology? What was it like to study under the AI giant Andrew Ng?
Ye Tian: Studying under Professor Andrew Ng has been an invaluable experience. He is profoundly visionary, believing that AI will become the “electricity” of the new era—empowering every industry and product, from manufacturing to everyday life. He underscores the critical importance of technology; Google’s Cat Project, for instance, exemplifies how neural networks can autonomously acquire knowledge. At the same time, he places great emphasis on engineering, advocating the construction of robust neural networks integrated with massive clusters and vast amounts of data.
This experience has had a profound impact on me. It made me realize that developing AI requires both a long-term vision and solid engineering expertise. Ideas have to be backed by a deep understanding of infrastructure, tools, and how AI systems are built.
Cyzone: When you joined Apple, the company’s AI technology ecosystem was still in its early stages. Could you share what the initial conditions were like when you first became involved in Apple’s AI projects?
Ye Tian: When I joined Apple, the company’s AI capabilities were still in their infancy, with only a handful of applications—such as facial recognition—already in use. Even so, Apple had already committed to investing heavily in AI research and development, for example by designing the Neural Engine (NPU) chip, and several small teams were experimenting with integrating AI into its products, though these efforts lacked systematic support. What I set out to do was, in essence, enablement: building a platform that would make it easier for AI developers to turn their ideas into high‑quality products.
I led my team in building an on-device machine learning platform, which accomplished two key tasks. First, we built a model evaluation and training system that not only measured model accuracy but also assessed how well a model integrated with Apple’s ecosystem. Because the Neural Engine chip had not yet been manufactured, we built a simulator and exposed it as a web service, so that a model architecture could be tested for its power consumption and speed on Apple’s custom silicon. The platform ultimately became the company-wide standard for model development and evaluation. Second, we developed an on-device inference engine, enabling neural network models to run on Apple devices such as iPhones, Apple Watches, and Macs. Almost all of Apple’s AI features, such as facial recognition in the camera, depth-of-field algorithms, and Siri’s speech recognition, are the result of this kind of work. It’s truly rewarding. For example, some of my elders aren’t very comfortable using input methods. I showed them that wherever they need to type on their phones, they can simply rely on our offline speech-recognition system. I’ve actually seen their daily lives become a bit easier thanks to the products I helped create.
In addition, we have built a robust ecosystem. For example, Apple boasts a large and vibrant developer community, and we offer different development frameworks tailored to developers’ varying skill levels. For developers who are proficient in AI, we provide a general-purpose framework that enables them to easily deploy models on Apple devices; for those who prioritize user experience but lack AI expertise, we offer specialized frameworks that let them leverage powerful AI capabilities out of the box; and for developers in between, we deliver flexible, customizable solutions.
On a spiritual level, Apple’s developers are deeply committed and highly aligned with the company’s values. This is due in part to Apple’s strong product platform, and also to our proactive efforts to foster a vibrant developer community. For example, we host an annual Worldwide Developers Conference, maintain a developer forum for regular engagement, and even invite developers with unique needs to collaborate directly at the company.
Cyzone: We’ve noticed that you played a pivotal role in building Apple’s AI product ecosystem and its broader AI‑application ecosystem, serving over one billion users. Could you share the most important product‑development insights you gained while constructing this vast ecosystem?
Ye Tian: Apple is the world’s leading product company; in both product innovation and technological prowess, few firms can match it.
Apple’s core strength lies in treating each user as an individual and putting their experience first. This respect for users fosters an emotional connection between them and the device. For example, with the “Memories” feature we introduced in Photos, the system subtly and intelligently curates edited videos of past moments, making users feel that their life’s memories are being thoughtfully preserved. This kind of gentle, almost imperceptible care seamlessly integrates people into the ecosystem.
Product design and R&D emphasize systems thinking rather than isolated breakthroughs. Take the latest iPhone’s large‑model capabilities as an example: this requires a holistic consideration, spanning from the underlying chip computing power, memory management, and thermal control, through the mid‑tier software architecture design, to the seamless integration at the top layer with the application ecosystem.
The advantage of systems thinking is that it enables us to efficiently integrate both internal and external best-in-class technologies, ultimately converging them into a unified product. To achieve this, we adopt proven vendor‑provided solutions while independently developing the core components.
Cyzone: During your seven years at Apple, you grew from an engineer to a technical leader. What insights have you gained in managing teams?
Ye Tian: I believe team management is a science that requires systematic thinking to address challenges. When leading a team, I place the greatest emphasis on two key dimensions: solid technical fundamentals and genuine passion. A manager must inspire team members’ sense of purpose and enthusiasm for the mission in order to sustain the team’s vibrant creativity.
What left a deep impression on me was our experience developing the vision system in 2018 and 2019. At the time, we needed to implement multi‑object detection and segmentation, but the relevant technologies were still immature on mobile devices, and some network architectures weren’t even supported by the neural engine. A team member who excelled in algorithms but had limited familiarity with hardware and low‑level software nonetheless took the initiative to dive deep into the technical details, driven by his passion for the field. Together, we developed a novel compiler technology and ultimately succeeded in deploying a complex, high‑dynamic‑range network onto mobile platforms.
When venturing into uncharted territory, it’s exceedingly difficult to impose top-down plans on team members. Instead, we should foster an environment and provide the support that empowers each member to tap into their own potential. Those who proactively take initiative and push forward often deliver greater value to the team. This genuine appreciation for technical passion ultimately shines through in product quality, allowing users to sense the creators’ dedication.
Cyzone: When you were at Apple, as a pioneer with no role models to learn from, did you rely on self‑exploration or on identifying and addressing specific needs?
Ye Tian: First of all, in virtually any field today, we never start from scratch. In the realm of AI, Stanford and Silicon Valley already boast numerous pioneers, among them my mentor and Lin Shao’s mentor, who have all made crucial foundational contributions.
Our core objective was to ensure the organization operated efficiently and to inspire more talent to get involved. At the time, AI expertise was in short supply, so the key challenge was how to help those eager to enter the field but lacking deep experience grow rapidly, while effectively coordinating team efforts to co‑create new products. This was the primary reason I strongly championed building an AI platform.
From the perspective of product definition, the requirements of many application scenarios are readily apparent. For instance, the implementation of facial recognition and voice‑interaction features is a natural extension of AI capabilities. Ultimately, AI is artificial intelligence: many aspects of human intelligence, which prove valuable in humans, can likewise be leveraged in artificial systems.
Many of these demands don’t stem from market research; they arise from a deep understanding of users’ fundamental expectations. Just as people naturally expect their smartphones to deliver the kind of image quality once reserved for DSLR cameras, our mission is to turn that aspiration into reality through AI technology.

Ye Tian (third from the left) and his colleagues at an event hosted by Apple.
The “GPT moment” for embodied intelligence will arrive within the next five years.
Cyzone: To many, the path from Stanford to Apple seems like an ideal career trajectory. What prompted you to leave and return to China to start your own venture? And why did you choose the field of embodied intelligence?
Ye Tian: I’m from Zigong, Sichuan. Zigong is known as the “Dragon Capital” because it’s a major source of fossils, especially dinosaur remains. As a child, I spent countless hours at the dinosaur museum, with its sauropods towering several stories high, stegosaurs with spiky backs, pterosaurs soaring through the skies, and plesiosaurs gliding through the water. Just a quick science note: the latter two aren’t dinosaurs (laughs). These creatures that once roamed the Earth made me realize how vibrant and diverse our world truly is, and they planted in me an aspiration: Could human ingenuity make the world even richer and more beautiful?
Later, my academic and professional pursuits gradually converged on the field of AI, as it is a quest to create intelligence through human ingenuity. The diverse AI products I developed at Apple have genuinely delivered substantial value to users around the world. But I have always hoped to create intelligent agents that are more attuned to the natural rhythms of human life.
This brings me to my good friend, Lin Shao. Shao was my classmate during my time at Stanford. He has long been engaged in scientific research on embodied intelligent robots, while I have focused more on AI across various domains in the digital world. We both believe that physical robots endowed with general intelligence represent the ultimate goal of our work. An early example dates back to 2020, when Lin Shao and I met at a small bistro near Stanford to discuss how to enable natural-language-based interaction and manipulation with robots. That work eventually culminated in the paper “Concept2Robot,” which was among the earliest contributions to what is now the highly popular VLA (Vision-Language-Action) field.
Subsequently, the generalizability of language models surged with the advent of ChatGPT. Both academia and industry have mounted numerous efforts to replicate the scaling laws in the robotics domain. After extensive deliberation and discussion, we have ultimately defined what we consider a viable, general-purpose technical roadmap—namely, VLOA. I told him, “Let’s create intelligent robots together.” And so, we embarked on our entrepreneurial journey.
And the reason I decided to return home is that I saw a faster pace of development.
Objectively speaking, Silicon Valley in the United States is a region where technological resources are highly concentrated, bringing together numerous tech companies, universities, and research institutions, and fostering a remarkably dynamic culture of innovation. There, many individuals pursue their careers out of passion, exhibiting a strong entrepreneurial spirit and proactive mindset, continuously driving the development of cutting-edge technologies.
Meanwhile, China also boasts world-class conditions for innovation. The country is brimming with vitality: as night falls, the streets remain bustling, and one can see everywhere people working hard and pursuing their endeavors with determination.
For example, from an industrial‑ecosystem perspective, China’s Greater Bay Area boasts a well‑developed supply chain, with a dense concentration of robotics companies. From components and testing resources to software support, the ecosystem is exceptionally robust, offering us tremendous convenience. When iterating on our robotic hardware, we can often source suppliers within a ten‑minute drive. This is critical for startups: in an environment of uncertainty, they must iterate rapidly and swiftly bring new products to market in order to continually push the boundaries.
Cyzone: What do you see as the similarities and differences between Apple’s products and robots?
Ye Tian: I believe Apple products, especially smartphones, are highly similar to robots: both are intelligent devices that receive external input and generate output after processing it. From the perspective of their overall software-and-hardware architecture, they also share many commonalities. Moreover, while a smartphone is each person’s “personal device,” future robots should become everyone’s “personal companion.” You want them to fully understand you, never betray you, and keep your data strictly confidential. You also expect them to possess a certain sense of “life.” Just as we expect our smartphones to respond with lightning speed, a robot’s immediate feedback upon receiving information is a direct manifestation of that “sense of life.”
Meanwhile, Apple’s two monumental successes both stemmed from transformative shifts in interaction paradigms. The first was the widespread adoption of graphical user interfaces through the Macintosh, and the second was the popularization of touchscreens with the iPhone. Yet both of these innovations featured contactless output. By contrast, embodied intelligence will enable contact-based output—meaning that robots can engage in tactile interactions with humans and their physical environment. As a classic adage in media studies goes: “The medium is the message.” I believe that the innovations in interaction paradigms brought about by embodied intelligence will represent another major breakthrough.
Cyzone: In your view, what is the biggest bottleneck currently hindering the development of embodied intelligence? What are RoboScience’s core competitive advantages?
Ye Tian: I believe this is a nascent industry, with players still in the exploratory phase and no clear consensus on the way forward. Many have adopted methodologies inspired by large-scale models, hoping that one day general-purpose capabilities will begin to emerge.
Our strength lies in our deep reflection on the essence of embodied intelligence and in how to achieve superior performance with less data. Currently, many solutions in the industry rely heavily on real-device data collection, but the sustainability of this approach is questionable. We have estimated that an operator can collect at most 200–300 valid data points per day, whereas achieving true intelligent generalization requires a data volume that exceeds this by several orders of magnitude. This reliance on manual data collection poses significant challenges in terms of both data scale and time costs.
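The scale gap described above can be made concrete with a little arithmetic. The following is a minimal sketch, not RoboScience's own estimate: the 300-samples-per-day figure is the upper end of the range quoted in the interview, while the target dataset sizes and the 100-operator fleet are hypothetical values chosen only for illustration, since the interview says no more than "several orders of magnitude."

```python
import math


def operator_days_needed(target_samples: int, samples_per_day: int, operators: int = 1) -> int:
    """Return the number of whole days needed to collect `target_samples`
    when each of `operators` people yields `samples_per_day` valid samples."""
    return math.ceil(target_samples / (samples_per_day * operators))


# From the interview: at most 200-300 valid data points per operator per day.
SAMPLES_PER_DAY = 300  # optimistic end of the quoted range

# Hypothetical dataset sizes (assumption, not stated in the interview).
HYPOTHETICAL_TARGETS = (10**6, 10**8)

for target in HYPOTHETICAL_TARGETS:
    solo = operator_days_needed(target, SAMPLES_PER_DAY)
    fleet = operator_days_needed(target, SAMPLES_PER_DAY, operators=100)
    print(f"{target:>11,} samples: {solo:,} days for 1 operator, "
          f"{fleet:,} days for 100 operators")
```

Even with the optimistic daily rate and a hypothetical fleet of 100 operators, the larger of these illustrative targets works out to thousands of days of collection, which is the sustainability concern raised in the interview.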
The VLOA model architecture proposed by our team hinges on a core innovation: it captures the essential nature of robot–physical-world interaction. Take the action “moving a cup from the tabletop to in front of me” as an example: the key lies not in who performs the action, but in the description and planning of the object’s trajectory itself. By generalizing task planning, we can leverage vast amounts of video data as training resources, thereby reducing our reliance on real‑world hardware‑based data.
Secondly, it is necessary to address the question of “how to execute,” namely, enabling any robot to manipulate any object to accomplish a task. This is akin to the way infants learn: by instinctively applying forces to various objects and observing the resulting changes in their states, they abstract general physical laws.
This approach lets us save the manpower and resources that would otherwise go into collecting real-robot data, so we can focus our core efforts on large-model algorithms. When we took this approach last year, it still seemed rather unconventional, but now even Musk has suggested using video-based training to replace real-robot data, which shows that the industry is moving in the same direction as we are.
Cyzone: In which sectors are the most likely real-world applications emerging?
Ye Tian: I believe that the real-world applications of embodied intelligence are brimming with potential. As we mentioned earlier, the ultimate goal is to achieve “generalization,” enabling any robot to manipulate any object and perform any task.
At present, the scenarios that are easiest to implement are those with well-defined tasks and relatively stable environments. For example, in logistics warehouses or commercial service settings, robots must handle a wide variety of objects, but their core operations are largely bounded—tasks such as picking up, stacking, and arranging items. These environments are clearly defined, making the technical implementation comparatively straightforward.
In contrast, entertainment‑oriented robots or running robots do not need to interact with complex objects; their tasks are relatively simple, and the technical challenges are somewhat lower.
The true advantage of embodied intelligence lies in overcoming the bottlenecks of current automation. Traditional solutions require re‑tuning for every new scenario, resulting in high costs and long development cycles. In contrast, our system boasts strong generalization capabilities, enabling it to adapt rapidly to new environments and objects, significantly reducing deployment expenses while enhancing flexibility in handling complex scenarios. This is what truly constitutes an “intelligent upgrade.”
Cyzone: How do you envision the future development of embodied intelligence? Are there any pivotal turning points akin to the “GPT moment,” and when might they occur?
Ye Tian: The “GPT moment” of embodied intelligence is indeed difficult to measure by a single standard, but we can examine it from two dimensions.
The first is the technological dimension. At the height of its emergence, ChatGPT’s cognitive capabilities may have been comparable to those of a 10-year-old child, whereas embodied intelligence hinges on dexterity and operational skills. A two- or three-year-old can use a spoon, a four- or five-year-old can wield scissors, and by age six, they’re already capable of writing and performing fine‑motor tasks. I believe we can set similarly early benchmarks for robots, since physical development typically outpaces brain development. Once a robot attains the manual dexterity of a five- or six-year-old, it will have achieved true generalization from a technical standpoint.
The second is the product dimension. A key hallmark is that ordinary users can get up and running within five minutes and find the value it delivers exceeds its cost. Just as ChatGPT once made conversational AI accessible to everyone, if a complete beginner can instruct a robot to perform tasks in five minutes—and do so more affordably than other alternatives—then that product has truly taken off.
From these two perspectives, I believe the “GPT moment” for embodied intelligence will arrive within the next five years.
Cyzone: Looking globally, what kind of competitive and collaborative landscape do you foresee in the field of embodied intelligence over the next three to five years?
Ye Tian: Embodied intelligence goes far beyond a single product; it is a strategic industry poised to reshape the social landscape, much like smartphones or automobiles, giving rise to robust industrial clusters. For this very reason, I believe the sector is unlikely to be dominated by a single monopolist. Instead, it will feature an exceptionally long value chain, with numerous players at every stage. The final products, too, will reflect a rich diversity of offerings.
The potential of robots is immense. These days, many people are focused on humanoid robots, but who’s to say we can’t have dinosaur‑shaped ones too? (laughs.) And Doraemon isn’t humanoid either, right? Robots in all sorts of forms will shine in different settings. We should think more broadly—since this is an era that offers us unprecedented creative opportunities, there’s no need to confine our imagination.
From a national perspective, a “dual-engine” pattern of cooperative competition, led by China and the United States, is more likely to emerge. China’s core strengths lie in its complete industrial chain and vast market depth, enabling large-scale, universally accessible commercial applications. By contrast, the United States is likely to focus on the high-end segments of the value chain, developing premium, high-value-added solutions aimed at the high-end market. Other regions may find it difficult to establish an independent third pole, but they can participate through cooperation with the Chinese and U.S. ecosystems, serving as key partners in application supply chains and markets, thereby helping to shape the global landscape.
Endow robots with a sense of “life,” making them humanity’s “faithful companions.”
Cyzone: How did you and your co-founders come together? How does the team’s diverse background—spanning academia, product development, and other areas—complement one another?
Ye Tian: We currently have four co-founders, two of whom have been my close friends for many years.
Lin Shao is my close classmate and dear friend from Stanford University. We enrolled together in 2014 and both pursued advanced studies in the Artificial Intelligence Laboratory. While I focused on AI, he specialized in embodied robotics.
We’ve known each other for many years, built a strong foundation of trust, and complement one another in personality—he’s more meticulous and steady, while I’m more outgoing and energetic. We also bring complementary expertise: he has long focused on the low-level systems of robotics, with an emphasis on the scientific side, whereas I concentrate on the AI layer, leaning toward engineering implementation.
Another co-founder is my longtime friend, Wang Tao. We were undergraduates in the same year at the University of Science and Technology of China. Although we weren’t in the same department, we played on the same grade‑level soccer team, with him serving as our captain. Prior to this, he worked in the investment industry, gaining deep expertise in venture capital and private equity for tech startups—covering fundraising, internal operations, and strategy—and has made numerous investments. I believe his skills complement those of our entire team exceptionally well.
Another co-founder, Liu Penghai, is a veteran of the industry with over 20 years of experience in hardware R&D, management, and supply chain. He previously worked at Ecovacs, one of the largest robot manufacturers by shipment volume. I met him last year, and we quickly became good friends. We share many commonalities, from technical complementarity to product vision.
Cyzone: Will RoboScience also develop its own robot hardware?
Ye Tian: Although many robots on the market today are humanoid, their internal architectures vary widely, and the requirements for each application differ significantly. In particular, the “hand”—that is, the end effector—has distinct needs depending on the specific scenario.
So, at the outset, we’ll lean heavily on integration, but we’ll gradually develop our end-effectors and other core components in-house, with a strong focus on delivering the best possible user experience. Ultimately, we must optimize both software and hardware together—much like Apple’s approach: only by tightly integrating software and hardware can we deliver a truly exceptional experience.
Beyond chips and sensors like those found in smartphones, the powertrain is equally critical. In most cases, off-the-shelf solutions suffice, but for certain specialized requirements, we either develop custom solutions ourselves or collaborate closely with partners across the industry chain to refine them. Of course, we build robots in-house, but we don’t tackle everything on our own. Ultimately, product‑driven innovation—delivering an exceptional user experience—requires seamless integration and continuous refinement of both software and hardware.
Our algorithm has a unique advantage: it enables rapid cross‑robot transfer, allowing the same model to support robots of different forms. However, we won’t indiscriminately cover every use case; instead, we’ll focus on core scenarios to deliver a seamless user experience. As for other use cases, we’re eager to collaborate openly and work with all robotics companies to expand the ecosystem.
Cyzone: What are your business‑level considerations and plans for the next five years?
Ye Tian: Our development will proceed in several phases. At this stage, the priority is to refine the model’s capabilities to their fullest potential while piloting it in a limited set of use cases. Proof-of-concept (POC) validation, meaning putting the robot through real-world testing, is the top priority from this year through next. Following that, we’ll deeply integrate our in-house robots with our algorithms and ensure seamless operation in real-world scenarios.
In the long term, we need to pursue both B2B and B2C strategies. Especially in the consumer‑facing space, robots shouldn’t merely be tools—they should become companions that seamlessly integrate into everyday life and truly understand you. Many people say that dogs are humanity’s most loyal companions. In the future, robots will likewise serve as faithful, intelligent, and capable partners—this is precisely the vision we ultimately aspire to realize.
Drawing on Apple’s experience, building ecosystem barriers can be divided into three tiers: the first is ensuring that the product itself delivers an exceptional user experience; the second is building a vibrant ecosystem of hardware, software, and developers; and the third is brand building, so that when users think of “embodied intelligence,” they naturally associate it with our brand, “RoboScience.”
Although there is some uncertainty in the specific timeline, we will focus on rapid iteration and steady progress. In the first phase, we will launch several market‑validated products to build a core user base; in the second phase, we will gradually expand our user base, develop a comprehensive ecosystem, and attract developers and partners to join; and in the third phase, we will prioritize strengthening our brand influence. We plan to complete this three‑phase roadmap within five years.
The robotics ecosystem will be even broader than the mobile‑phone one, because every user can become a developer—you can directly teach your robot new tasks. Technically, most of its capabilities will run on the device itself. This is key to giving robots a sense of “life”—that is, independent, rapid responses and actions—and it will also keep your secrets safe. In other words, it should “act independently” like a living organism, while also being a friend you can absolutely trust.