From Robin


Revision as of 16:13, 6 December 2016

Master thesis

Possible projects

Visualization

Goals:

Understand the robot's algorithms better by visualizing them at runtime.
See the effects of parameter tuning more easily.

Things:

Compare different models
Use the actual camera stream and overlay generated 3D models

I see visualization through VR as somewhat unnecessary, since VR will most likely not improve understanding much. 2D works perfectly well here. To find out whether the goal has been reached, the system must be user tested to see whether the users find the algorithms easier to understand. This can be a challenge, since it requires both enough users (20?) and a well-designed survey.

Remote control

Goal: Solve a task with the robot more easily with the help of VR. Show that it is easier to see obstacles or to control a robot arm with VR.

The challenge here is building a robot arm. Testing how effective the system is is easier than for visualization, since one can measure the time it takes to solve the task, or the accuracy with which it was performed. It could be an idea to use the Vive's hand controllers to control the robot arm.

Visualization through AR

Use AR, ideally through glasses, to see the robot's algorithms visualized. This requires both good AR glasses (Microsoft HoloLens or the HTC Vive camera) and a lot of computing power. The user's exact pose (position and orientation relative to the robot) must be kept up to date. I imagine this project can be solved in 2D on a screen first, then in VR, and then ported to AR.

This idea gets me really hyped and is the one I want to go for.

Goals:

Build a system that makes it easier to understand the robot's algorithms by visualizing them at runtime
See the effects of parameter tuning more easily

Challenges:

I don't know how well AR works with the Vive. Its camera is not a stereo camera, which is a bit of a shame. On the other hand, an iPad, which is also an alternative, would be mono as well.
Finding metrics for how good the result is.

Visualization of sensory data and algorithms through augmented reality

Metrics

How much better does the user understand how the robot works?

Things

Low latency on head mounted displays is very important for user experience.

Questionnaire

Rate traditional visualization (Gazebo) 1-5 vs. AR system

The difficulty of understanding the robot's position in its environment (easy to hard)
The difficulty of understanding the robot's orientation in its environment (easy to hard)

Let's go

ROS -> Unity

Data from ROS can be sent over WebSockets with RosBridge.
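As a rough illustration of what travels over that socket, the sketch below builds a rosbridge "subscribe" request and unpacks a "publish" frame using only the standard library. The helper names subscribe_request and parse_publish are made up for this example, the turtlesim topic is just a convenient stand-in, and opening the WebSocket itself is left out.

```python
import json


def subscribe_request(topic, msg_type):
    """Build the JSON text that asks rosbridge to start streaming a topic."""
    return json.dumps({"op": "subscribe", "topic": topic, "type": msg_type})


def parse_publish(raw):
    """Return (topic, msg) for a rosbridge "publish" frame, otherwise None."""
    frame = json.loads(raw)
    if frame.get("op") != "publish":
        return None
    return frame["topic"], frame["msg"]


# Example topic only; any topic/type pair known to the ROS master would do.
request = subscribe_request("/turtle1/pose", "turtlesim/Pose")
```

The request string is what a client would send once after connecting; every frame that comes back can then be passed through parse_publish.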

Similar projects

ARSEA

ARSEA - Augmented Reality Subsea Exploration Assistant

Github project

This project includes code for writing point clouds to file for testing (folder: /PointCloudManager). It does not read the points from the socket yet, but the infrastructure is there.

Turtlesim example

Example of getting turtlesim data into Unity

Github projects

C# point cloud library

Sensor

RealSense

The ROS package realsense_camera features support for the F200 camera (Creative).

Installation

Clone librealsense and follow the installation guide. Points 4 and 5 can be skipped, since we put a symlink to librealsense in our catkin_ws/src dir and build it with catkin.

Then clone realsense and put a symlink to the inner folder /realsense_camera in the catkin_ws/src dir. Build everything.

How to run

roslaunch realsense_camera f200_nodelet_default.launch

Example modified launch file: File:F200 nodelet modify params.xml

Into Unity

As of 14/11/2016, I have found two ways to get data from this sensor:

CompressedImage

The first one is the official one which uses the CompressedImage message for sending depth data.

The data is published on the topic /camera/depth/image_raw/compressedDepth. At first I was not able to decompress it in Unity using Texture2D, because the image was read as RGBA24. ROS reports the encoding as 16UC1, and the settings of the ROS package suggest that the payload is PNG.

Update: After two weeks, I finally managed to get this topic into Unity. The first 12 bytes of the data packet are junk and have to be removed before the PNG can be decoded. However, if a compressed image is used, the point cloud has to be generated on the receiving side. This means using the camera matrix to map each pixel to its corresponding ray through the lens and into 3D space.
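The header stripping described above can be sketched in a few lines. This is only an illustration in Python, not the Unity code: the function name extract_png is hypothetical, and the 12-byte offset is taken from the observation above rather than from any specification. The check against the standard PNG signature is there to catch a wrong offset early.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the 8 bytes every PNG file starts with
HEADER_BYTES = 12                     # offset observed above; an assumption, not a spec value


def extract_png(payload, header_bytes=HEADER_BYTES):
    """Drop the leading bytes of a compressedDepth payload and return the PNG part."""
    png = bytes(payload[header_bytes:])
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("no PNG signature after the header")
    return png
```

The returned bytes can then be handed to any PNG decoder that supports 16-bit single-channel images.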

Mapping the depth image to 3D points

Camera parameters


   D: [0.1387016326189041, 0.0786430761218071, 0.003642927622422576, 0.008337009698152542, 0.09094849228858948]
   K: [478.3880920410156, 0.0, 314.0879211425781, 0.0, 478.38800048828125, 246.01443481445312, 0.0, 0.0, 1.0]
   R: [1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0]
   P: [478.3880920410156, 0.0, 314.0879211425781, 0.025700006633996964, 0.0, 478.38800048828125, 246.01443481445312, 0.0012667370028793812, 0.0, 0.0, 1.0, 0.003814664203673601]

K is the calibration matrix. It maps points on the normalized image plane, x = [x, y, 1], to pixel coordinates, u = [u, v, 1]: u = K * x. The matrix is 3x3 because it is a homogeneous transformation. This transformation is the intrinsic part of the full perspective camera model.

To calculate the 3D point in space, one must solve for x: u = K * x => x = K^-1 * u. Since coordinates on the normalized image plane are homogeneous, we can multiply them by our depth measure (z) to extend the vector to the appropriate point in 3D space. x is [x/z, y/z, 1], which gives the following result: [x, y, z] = K^-1 * u * z

K^-1

0.00209035	0	-0.656555
0	0.00209035	-0.514257
0	0	1
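A minimal sketch of this back-projection, using the focal lengths and principal point from the K listed under the camera parameters. The function names are made up for the example, the lens distortion coefficients D are ignored, and the depth z is assumed to already be in the unit you want the 3D point in.

```python
# Intrinsics copied from K above: focal lengths and principal point, in pixels.
FX, FY = 478.3880920410156, 478.38800048828125
CX, CY = 314.0879211425781, 246.01443481445312


def inverse_intrinsics(fx, fy, cx, cy):
    """Closed-form inverse of K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]."""
    return [
        [1.0 / fx, 0.0, -cx / fx],
        [0.0, 1.0 / fy, -cy / fy],
        [0.0, 0.0, 1.0],
    ]


def deproject(u, v, z, k_inv):
    """Map pixel (u, v) with depth z to a 3D point: [x, y, z] = K^-1 * [u, v, 1] * z."""
    pixel = (u, v, 1.0)
    ray = [sum(k_inv[r][c] * pixel[c] for c in range(3)) for r in range(3)]
    return tuple(z * component for component in ray)


K_INV = inverse_intrinsics(FX, FY, CX, CY)
```

Because K has this simple structure, the inverse does not need a general matrix routine, and a pixel at the principal point maps to a point straight ahead of the camera.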

Mapping depth points to the normalized image plane

PointCloud2

The other one is an older (2015) implementation, which instead uses the PointCloud2 message for sending depth data.

PointCloud2 live data packet example:


 header:
   seq: 537
   stamp: 
     secs: 1478272396
     nsecs:  68496564
   frame_id: camera_depth_optical_frame
 height: 480
 width: 640
 fields: 
   - 
     name: x
     offset: 0
     datatype: 7
     count: 1
   - 
     name: y
     offset: 4
     datatype: 7
     count: 1
   - 
     name: z
     offset: 8
     datatype: 7
     count: 1
 is_bigendian: False
 point_step: 16
 row_step: 10240
 data: [0, 0, 192, 127, 0, 0, 192, ...]
 is_dense: False

CompressedImage live data packet example:

The message is documented here

Datatype 7 means float32

The binary data array is of type uint8 and can be read like so:


 [      x      ][       y      ][      z       ][   garbage?  ][      x       ] ...
 0, 0, 192, 127, 0, 0, 192, 127, 0, 0, 192, 127, 0, 0, 128, 63, 0, 0, 192, 127,      ...

Each point uses 16 bytes (point_step), that is, 16 places in the array. The x, y and z fields only cover the first 12 bytes, so the last 4 bytes, labeled "garbage?", are possibly redundant and could be removed to save bandwidth.

Floats can be extracted like this:

float myFloat = System.BitConverter.ToSingle(mybyteArray, startIndex);
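For checking the layout outside Unity, the same extraction can be sketched in Python with the struct module. The function name unpack_xyz is hypothetical, and the defaults (point_step of 16, little-endian, x/y/z at offsets 0, 4 and 8) are taken from the example packet above; other sensors may use a different layout.

```python
import math
import struct


def unpack_xyz(data, point_step=16, big_endian=False):
    """Read (x, y, z) float32 triples from a PointCloud2 data array."""
    fmt = (">" if big_endian else "<") + "fff"
    points = []
    # Step one point at a time; the trailing 4 bytes of each point are skipped.
    for start in range(0, len(data) - point_step + 1, point_step):
        points.append(struct.unpack_from(fmt, data, start))
    return points


# First point of the example packet above.
sample = bytes([0, 0, 192, 127] * 3 + [0, 0, 128, 63])
first = unpack_xyz(sample)[0]
```

In the sample, the bytes 0, 0, 192, 127 decode to NaN, which fits is_dense being False (the cloud may contain invalid points), and the four trailing bytes 0, 0, 128, 63 decode to 1.0 when read as a float32.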

Plan

Visualizing sensor data in Unity

The point cloud is extracted from ROS through rosbridge. The data format is JSON and the transport is WebSockets. Unity will need an asynchronous thread to handle the data, to avoid frame drops due to latency. The point clouds are data intensive.
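One possible shape for that asynchronous handling is a worker thread that decodes messages and leaves only the newest one for the render loop, so a slow network never queues up stale clouds. The sketch below shows the pattern in Python rather than in Unity's C#; the class LatestFrame and the function network_worker are hypothetical names, and the list of JSON strings stands in for a real socket.

```python
import json
import threading


class LatestFrame:
    """Thread-safe slot that keeps only the newest decoded message."""

    def __init__(self):
        self._lock = threading.Lock()
        self._frame = None

    def put(self, frame):
        with self._lock:
            self._frame = frame  # overwrite, so older frames are dropped

    def take(self):
        """Return the newest frame, or None if nothing new has arrived."""
        with self._lock:
            frame, self._frame = self._frame, None
            return frame


def network_worker(raw_messages, slot):
    """Decode incoming JSON off the render thread and publish the result."""
    for raw in raw_messages:
        slot.put(json.loads(raw))


slot = LatestFrame()
worker = threading.Thread(target=network_worker, args=(['{"seq": 1}', '{"seq": 2}'], slot))
worker.start()
worker.join()
newest = slot.take()
```

The render loop would call take() once per frame and simply keep drawing the previous cloud when it returns None.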

Articles

http://gbib.siggraph.org/ - Library from SIGGRAPH - Computer graphics conference

Virtual environments, 17 articles - frontiers [1]

Challenges in virtual environments (2014) [2]

Virtual reality

Multi-sensory feedback techniques, user studies [3]

Telepresence: Immersion with the iCub Humanoid Robot and the Oculus Rift [4]

Virtual reality simulator for robotics learning [5]

Augmented reality

User testing on AR system Instruction for Object Assembly based on Markerless Tracking

Evaluating human-computer interface in general Evaluation of human-computer interface for optical see-through augmented reality system

Sensor Data Visualization in Outdoor Augmented Reality

Discusses the challenges with displays in lit environments [6]

Technical

Integration between Unity and ROS [7]

Unity: ROSBridgeLib [8]

Theses

Teleoperation + visualization with Oculus [9]

User studies on HRI
Bandwidth considerations

Remote Operations of IRB140 with Oculus Rift [10]

Controlling a robot arm with a stereo camera and an Oculus

Books

Computer vision [11]

Internet

Camera matrix


User:Mathiact