I am Nivedita Rufus. I am an active researcher and very passionate about robotics. Owing to this passion, I primarily work with self-driving cars, which is also part of my research work at IIIT Hyderabad, India. I am also a GSoC'20 intern at RoboComp and will be working on the project “Software agent for estimating occupancy in medium and large buildings using RGB cameras”. This blog will be mostly about my GSoC journey.
I decided that I wanted to participate in GSoC 2020 sometime around September 2019. This was quite early, considering that there were still five months before the organizations would be announced. Still, I went through the organizations that participated in GSoC 2019. I knew I would be very comfortable working on robotics-related projects, so I looked through the organizations which had projects in this field. I also followed up and tried to understand, as much as I could, the projects undertaken by each of the organizations.
This post is about my very first bug fix in RoboComp’s human-detection repository. Everybody starts with a very small bug, and so did I. I found this issue on March 16 (before the student application deadline) when I was trying to get the PeopleCounter module working on my PC. I realized that a line was missing in the config file that hosts the People Server. I also found an issue with the path to MobileNetSSD_deploy.
I found this issue because I was still unable to get the PeopleCounter component working. It raised an import error because some functions were imported from the resources directory, which is outside the working directory. My commit ensures that the Python modules are imported properly. I made a pull request for this fix, which has now been merged.
Status: Merged
I submitted a pull request to update the file specificworker.py in the openpifpafserver component. The added lines initialize some parameters used in the file; not specifying their values leads to AttributeErrors. I initialized all of these parameters to the default values specified in the openpifpaf package. I believe this should solve the issue.
Status: Merged
Given the declining support for Python 2 and the benefits of upgrading to Python 3, it is generally advisable for a developer to choose Python 3. So, it should come as no surprise that the first task assigned to me was porting the People Counter module from Python 2 to Python 3, which is now done. The changes made are:
Changed all print statements from print " " to print().
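To illustrate the kind of change involved (the function and message below are made-up examples, not the module's actual code), the Python 2 print statement becomes a function call in Python 3:

```python
# Python 2 (no longer valid syntax in Python 3):
#   print "People count:", count
# Python 3 makes print a function:
def report(count):
    message = "People count: %d" % count
    print(message)
    return message

report(5)  # prints "People count: 5"
```

The `2to3` tool that ships with Python can automate most of these mechanical conversions.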
So the project phase is going to start in a few days and I am really under the pump. I promised my mentors that I would look through existing methods that can be used to count and track people, and find out which approach works best by testing all of them. The methods I want to look at are:
- CSRNet
- SS-DCNet
- Deep SORT
I am in the process of setting up necessary environments for each of the methods.
The project phase has started. I have made a standalone Python project to count people, which will be my baseline for testing different methods. You can find it here. This is a refactored version of RoboComp's existing PeopleCounter module. Once the most suitable method is chosen, I will integrate it with the RoboComp architecture, which forms the final stage of the project.
A short demo

So, there are two ways to count people inside a building:
In my earlier post, I had mentioned two approaches to count people. So, I have gone through the proposed methods namely:
- CSRNet
- SS-DCNet
- Deep SORT

As I have already mentioned, CSRNet and SS-DCNet are suited to cases where the footage is from within the room, keeping a running count of the number of people in the room frame by frame. Both of these networks work well for densely crowded places.
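Density-map networks like CSRNet reduce counting to summing the predicted density map. A minimal sketch of that final step, where the density map is a hand-made stand-in for a real network output:

```python
import numpy as np

def count_from_density_map(density_map):
    """Crowd count is the integral (sum) of the predicted density map."""
    return float(np.sum(density_map))

# Stand-in for a network output: a fake density map with two "people"
density = np.zeros((4, 4))
density[1, 1] = 1.0   # one person's worth of density mass
density[2, 3] = 1.0
print(round(count_from_density_map(density)))  # → 2
```

In practice the density map comes from a forward pass of the trained network on a frame, but the counting step is exactly this sum.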
I have successfully completed the first milestone of my GSoC, which was to count the people present in a scene, given a video feed, using existing deep network models. In my earlier post, I mentioned that I would be going forward with the SS-DCNet model due to the advantages I had discussed. I used the models trained on the ShanghaiTech dataset (parts A and B) and the UCF-QNRF dataset by the authors of the SS-DCNet paper (ICCV 2019).
In my earlier post, I mentioned that the people counter based on the SS-DCNet model sometimes fails when the place is very sparsely populated. So I compared its performance with a counter based on the SSD model. Given below is an example on footage of 18 people (SALSA dataset):

With the SS-DCNet-based counter

With the SSD-based counter

We can see that the SS-DCNet-based counter performs better than the SSD-based approach.
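A detector-based counter like the SSD approach simply counts per-frame detections of the person class above a confidence threshold. A minimal sketch with hypothetical detection tuples (class id 15 is the person label in the PASCAL VOC label map commonly used with MobileNet-SSD):

```python
PERSON_CLASS_ID = 15  # "person" in the PASCAL VOC label map used by MobileNet-SSD

def count_people(detections, conf_threshold=0.5):
    """Count detections of the person class whose confidence clears the threshold.
    Each detection is a (class_id, confidence, bounding_box) tuple."""
    return sum(1 for cls, conf, box in detections
               if cls == PERSON_CLASS_ID and conf >= conf_threshold)

# Hypothetical network output for one frame:
detections = [
    (15, 0.92, (10, 20, 50, 120)),   # confident person detection
    (15, 0.40, (60, 25, 90, 110)),   # below threshold, ignored
    (7,  0.88, (5, 5, 30, 30)),      # a car, ignored
]
print(count_people(detections))  # → 1
```

This also hints at why detectors struggle in dense crowds: heavily occluded people simply never produce confident boxes, whereas density-map methods degrade more gracefully.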
This will be my last post for the first phase of my GSoC journey. I had promised to deliver a working module that returns the count of people from a single view through a continuous video feed. From my earlier posts, you may have seen the challenges I faced and the solutions I proposed. I had mentioned that the use of filtering techniques may help smooth out unrealistic jumps in a continuous video feed. I resorted to two major methods which proved to improve the performance of the count values returned by the SS-DCNet model, namely,
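As one illustration of such filtering (not necessarily the exact filters I used), a sliding-median filter suppresses single-frame spikes in the count sequence:

```python
from collections import deque
from statistics import median

class MedianSmoother:
    """Smooths per-frame people counts with a sliding median to suppress spikes."""
    def __init__(self, window=5):
        self.buf = deque(maxlen=window)

    def update(self, count):
        self.buf.append(count)
        return median(self.buf)

smoother = MedianSmoother(window=5)
raw = [18, 18, 27, 18, 17, 18]   # 27 is an unrealistic single-frame jump
smoothed = [smoother.update(c) for c in raw]
print(smoothed)
```

The outlier frame (27) never reaches the output, because the median of the window ignores isolated extremes that a plain moving average would smear across several frames.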
I am so pumped up right now because I passed the first evaluation and will be getting paid soon. This has been a wonderful journey so far and I have no idea how the time passed so fast. Someone once said, “Do what you love, and you never have to work a day in your life”; I guess I have seen the truth of this statement. I am extremely grateful to my mentors for giving me the freedom to explore many things in this endeavor, and without even knowing it I have learned so much.
In one of my earlier posts, I mentioned the preprocessing step for counting the number of people from multiple partially overlapping views. This step involves stitching the images together to form a single image, which is then passed into the counting module (code). This may result in a performance overhead compared to the single-view case, which can be balanced by adjusting the skip_frames parameter in the code. The stitching algorithm used here is based on this publication.
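The skip_frames trade-off can be sketched as follows; the exact semantics I assume here (process every (skip_frames + 1)-th frame and reuse the last count in between) is my reading, so check the project code for the real behavior:

```python
def sample_frames(frames, skip_frames=2):
    """Yield only every (skip_frames + 1)-th frame for the expensive
    stitch-and-count pipeline; intermediate frames reuse the last count."""
    for i, frame in enumerate(frames):
        if i % (skip_frames + 1) == 0:
            yield i, frame

# With skip_frames=2, only a third of the frames are stitched and counted:
processed = [i for i, _ in sample_frames(range(10), skip_frames=2)]
print(processed)  # → [0, 3, 6, 9]
```

Raising skip_frames trades count freshness for throughput, which is why it is the natural knob for absorbing the stitching overhead.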
In my previous post, I mentioned a stitching algorithm that was supposed to be robust and insensitive to illumination changes. In this post, I verify that claim. I have implemented the standalone image stitcher here, where I change the illumination randomly (making the image dark, bright, hazy, etc.) to verify the robustness of the stitcher.
Given below is the result of the stitched video, where each frame is randomly subjected to illumination changes (bright, dark, haze):
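These illumination perturbations can be sketched with plain NumPy; the factor values and the haze blend below are illustrative choices, not necessarily the ones my stitcher test uses:

```python
import numpy as np

def adjust_brightness(img, factor):
    """Scale pixel intensities; factor < 1 darkens, factor > 1 brightens."""
    return np.clip(img.astype(np.float32) * factor, 0, 255).astype(np.uint8)

def add_haze(img, strength=0.4):
    """Blend the image toward white to mimic haze; strength in [0, 1]."""
    white = np.full_like(img, 255)
    blended = (img.astype(np.float32) * (1 - strength)
               + white.astype(np.float32) * strength)
    return blended.astype(np.uint8)

# A tiny uniform gray test image
img = np.full((2, 2, 3), 100, dtype=np.uint8)
print(adjust_brightness(img, 0.5)[0, 0, 0])  # → 50 (darkened)
print(add_haze(img, 0.4)[0, 0, 0])           # → 162 (washed out toward white)
```

Applying one of these at random per frame is enough to stress-test whether a feature-based stitcher still finds consistent matches across views.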
This is my last contribution to the second phase of my GSoC journey. Here, I have implemented the Boyer-Moore majority vote algorithm for video feeds from multiple cameras with different perspectives, i.e. for the views shown below (image source: SALSA dataset): Camera 1, Camera 2, Camera 3, Camera 4. As these images show, stitching is not possible in this case because the perspectives are different.
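The Boyer-Moore majority vote algorithm finds, in a single pass with constant memory, the value that holds a strict majority among the per-camera counts; if no strict majority exists, the returned candidate is not guaranteed to be the right one. A minimal sketch:

```python
def majority_vote(counts):
    """Boyer-Moore majority vote over per-camera people counts.
    Correct whenever one value occurs in more than half the entries."""
    candidate, votes = None, 0
    for c in counts:
        if votes == 0:
            candidate, votes = c, 1      # adopt a new candidate
        elif c == candidate:
            votes += 1                   # reinforce the current candidate
        else:
            votes -= 1                   # cancel against a disagreeing count
    return candidate

# Three of four cameras agree on 18 people:
print(majority_vote([18, 17, 18, 18]))  # → 18
```

This makes it a cheap way to fuse disagreeing counts from cameras whose views cannot be stitched into one frame.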
I am in the last leg of my GSoC journey. This has been an incredible experience so far. The amount of learning I have done is truly remarkable and something to be really grateful for. Initially, I was quite anxious about whether I would be able to deliver what I had promised. Now it looks like I will be able to deliver more than that.
I am really looking forward to reaching the finish line smiling but at the same time a little sad that it would bring my GSoC journey to an end.
I feel really great, as I am almost done with integrating the code into the RoboComp architecture. There are still a few things left, like documentation and maybe a little code cleanup. Over to the details.
The ability to configure an application’s properties externally provides a great deal of flexibility. One can use any combination of command-line options and configuration files to achieve the desired settings, all without having to modify source code of the application.
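As an illustration of this pattern (the section name, option names, and defaults below are hypothetical, not RoboComp's actual ones), command-line options can override values read from an optional INI config file:

```python
import argparse
import configparser

def load_settings(argv=None, config_path=None):
    """Layered configuration: built-in defaults, then an optional INI file,
    then command-line options, each layer overriding the previous one."""
    defaults = {"skip_frames": 2, "conf_threshold": 0.5}

    if config_path:
        ini = configparser.ConfigParser()
        ini.read(config_path)
        if "counter" in ini:  # hypothetical section name
            sec = ini["counter"]
            defaults["skip_frames"] = sec.getint("skip_frames", defaults["skip_frames"])
            defaults["conf_threshold"] = sec.getfloat("conf_threshold", defaults["conf_threshold"])

    cli = argparse.ArgumentParser()
    cli.add_argument("--skip-frames", type=int, default=defaults["skip_frames"])
    cli.add_argument("--conf-threshold", type=float, default=defaults["conf_threshold"])
    return cli.parse_args(argv)

# CLI flag overrides the default; untouched options keep their values:
settings = load_settings(["--skip-frames", "4"])
print(settings.skip_frames, settings.conf_threshold)  # → 4 0.5
```

Because every layer only fills in what the next one leaves unspecified, the same binary can be retargeted to a new deployment without touching its source.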
“If you do what you love, you never have to work a day in your life.”
I feel so good, and at the same time a little disappointed, that my GSoC journey is in the wrap-up phase. It was my privilege to work for RoboComp. I started out not knowing much, but in these three months the amount of learning that happened was exponential. This is probably the most productive summer I have ever had.
In the final phase of GSoC 2020, since I was done with most of the tasks I had been given, I was assigned a very interesting problem: to estimate people’s positions in a reference system that originates, for example, from the recording camera itself, and also to provide an estimate of their orientation. This looks like a simple problem on the surface, but it has its challenges: we would have to estimate the pose from a top-down view.