Abstract: Multi-Modal Interface Systems (MMIS) have proliferated in recent decades, since they offer a direct interface for Human-Computer Interaction (HCI) that resembles face-to-face communication. Our aim is to provide users without any prior 3D modeling experience with a multi-modal interface for creating 3D objects. The system also incorporates help throughout the drawing process and recognizes simple words and gestures to accomplish a range of modeling tasks, from simple to complex. We have developed a multi-modal interface that allows users to design objects in 3D using AutoCAD commands as well as speech and gestures. A microphone collects speech input and a Leap Motion sensor collects gesture input in real time. Two sets of experiments were conducted to investigate the usability of the system and to compare its performance with the Leap Motion against a keyboard and mouse. Our results indicate that performing a task using speech is perceived as exhausting when there is no shared vocabulary between the user and the machine, and that the usability of traditional input devices surpasses that of speech and gestures. Only a small fraction of participants, less than 7% in our experiments, were able to carry out the tasks with adequate precision.