Detecting soccer teams using unsupervised learning and tensorflow object detection (images and videos)

In the past we have used Tensorflow Object Detection to detect sharks, social distancing and squirrels. Detecting objects is fun and we can build on top of that. Our main task will be to detect the two teams on a soccer field. We will use Tensorflow Object Detection to detect the people and then we’ll use unsupervised learning to cluster the people objects based on their shirt color. We’ll use k-means to cluster the people objects.

We’ll start with the regular Tensorflow Object Detection sample. After that we’ll follow some steps to build our little project.

This will be our end result:

The first thing we’ll need to do is modify the method visualize_boxes_and_labels_on_image_array. This will allow us to use a different bounding box color for each team. Although we need to copy-paste the whole method, the change is pretty small:

        '''
        if agnostic_mode:
          box_to_color_map[box] = 'DarkOrange'
        elif track_ids is not None:
          prime_multipler = _get_multiplier_for_color_randomness()
          box_to_color_map[box] = STANDARD_COLORS[
              (prime_multipler * track_ids[i]) % len(STANDARD_COLORS)]
        else:
          box_to_color_map[box] = STANDARD_COLORS[
              classes[i] % len(STANDARD_COLORS)]
        '''
        box_to_color_map[box] = STANDARD_COLORS[team[i]]
        

We commented out the original color logic and assigned the color based on a team array (passed to the method as a new team argument) that contains a different number for each team.

Then we’ll have our main method which will let us detect the teams. At a high level this method performs the following steps:

  • Performs object detection and filters people
  • Processes the coordinates to feed them into the k-means
  • Uses k-means to find clusters
  • Displays the images with the teams detected

def detect_team(model, frame,df):
  # the array based representation of the image will be used later in order to prepare the
  # result image with boxes and labels on it.
  
  person_class = 1
  original_image = frame
  
  image_np = frame
  # Actual detection.

  output_dict = run_inference_for_single_image(model, image_np)

  boolPersons = output_dict['detection_classes'] == person_class
  output_dict['detection_scores'] = output_dict['detection_scores'][boolPersons]
  output_dict['detection_classes'] = output_dict['detection_classes'][boolPersons]
  output_dict['detection_boxes'] = output_dict['detection_boxes'][boolPersons]

  r_points = []
  b_points = []
  g_points = []    


  for i in output_dict['detection_boxes']:
    new_box = denormalize_coordinates(i,original_image.shape[1],original_image.shape[0])
    im2 = original_image[int(new_box[0]):int(new_box[2]),int(new_box[1]):int(new_box[3]),:]
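    # Average each color channel inside the box (channel order assumes an RGB image).
    r_mean = im2[:,:,0].mean()
    g_mean = im2[:,:,1].mean()
    b_mean = im2[:,:,2].mean()
    r_points.append(r_mean)
    b_points.append(b_mean)
    g_points.append(g_mean)

    new_row = {'R':r_mean, 'G':g_mean, 'B':b_mean}
    # DataFrame.append was removed in pandas 2.0; there, use pd.concat instead.
    df = df.append(new_row, ignore_index=True)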

  #print(df.shape)
  if len(output_dict['detection_boxes']) > 1:
    kmeans = KMeans(n_clusters = 2, init = 'k-means++', max_iter=1000, n_init = 100, random_state=0)
    y_kmeans = kmeans.fit_predict(df)
  
    visualize_boxes_and_labels_on_image_array(
      image_np,
      output_dict['detection_boxes'],
      output_dict['detection_classes'],
      output_dict['detection_scores'],
      category_index,
      instance_masks=output_dict.get('detection_masks_reframed', None),
      use_normalized_coordinates=True,
      line_thickness=8,
      team = y_kmeans)
    '''
    fig = plt.figure()
    ax = fig.add_subplot(111, projection='3d')
    ax.scatter(r_points, b_points, g_points, c=y_kmeans)
    plt.show()
    '''
  return image_np

Another interesting part is how we apply k-means. Given that images in numpy are represented as three-dimensional arrays (height, width and the red, green and blue channels), we average each channel over the bounding box and get 3 numbers per person object. We feed those 3 dimensions into k-means and get the clusters.
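The idea above can be sketched without any dependencies. This is only an illustration: the post itself uses scikit-learn's KMeans, while the tiny two_means function, the mean_rgb helper and the 2x2-pixel crops below are hypothetical stand-ins made up for the example.

```python
import random

def mean_rgb(crop):
    """Average each channel of a crop given as rows of (r, g, b) pixels."""
    pixels = [px for row in crop for px in row]
    n = len(pixels)
    return tuple(sum(px[c] for px in pixels) / n for c in range(3))

def two_means(points, iterations=20, seed=0):
    """Tiny k-means with k=2; returns one cluster label (0 or 1) per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, 2)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min((0, 1), key=lambda k: sum((p[c] - centers[k][c]) ** 2 for c in range(3)))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = tuple(sum(m[c] for m in members) / len(members) for c in range(3))
    return labels

# Hypothetical crops: two reddish shirts and two bluish shirts.
red_a = [[(200, 20, 30), (210, 25, 35)], [(190, 30, 20), (205, 15, 25)]]
red_b = [[(180, 40, 40), (220, 10, 30)], [(200, 30, 30), (210, 20, 20)]]
blue_a = [[(20, 30, 200), (25, 35, 210)], [(30, 20, 190), (15, 25, 205)]]
blue_b = [[(40, 40, 180), (10, 30, 220)], [(30, 30, 200), (20, 20, 210)]]

features = [mean_rgb(c) for c in (red_a, blue_a, red_b, blue_b)]
labels = two_means(features)
print(labels)  # the two red crops share one label, the two blue crops the other
```

Which team gets label 0 and which gets label 1 depends on the initialization, so only the grouping is meaningful, not the label values.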

You can also display the k-means visualization by uncommenting these lines:

    fig = plt.figure()
    ax = fig.add_subplot(111, projection='3d')
    ax.scatter(r_points, b_points, g_points, c=y_kmeans)
    plt.show()

I also added a code snippet that you can use to read a video and generate another video with the detected teams:

from google.colab.patches import cv2_imshow
import cv2

FILE_OUTPUT = "test.avi"
PATH_TO_VIDEO = 'models/research/object_detection/test_images/soccer.avi'

vcap = cv2.VideoCapture(PATH_TO_VIDEO)
frame_width = int(vcap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(vcap.get(cv2.CAP_PROP_FRAME_HEIGHT))

out = cv2.VideoWriter(FILE_OUTPUT, cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'),
                     24, (frame_width, frame_height))

# Note: OpenCV returns frames in BGR order, so channel 0 is blue here;
# the clustering itself is unaffected by the channel order.
# Read frames until the video ends: vcap.read() returns ret == False when no frame is left.
while True:
    ret, frame = vcap.read()
    if not ret:
        break
    im = detect_team(detection_model, frame,df)
    #cv2_imshow(im)
    out.write(im)

vcap.release()
out.release()

Take a look at the video:

You can find the code on this repository.

Standard

Object Detection. A shortcut when thinking about labeling images

Detecting objects in an image can be accomplished with a deep learning model. There are a lot of pre-trained models on the Internet that you can use. Sometimes you might want to train your own model to detect a specific object (sharks, squirrels, a mask on a person’s face…). There are multiple tutorials about how to train these models on custom datasets. Don’t.

At least, not before testing what already exists. Creating your own dataset, downloading hundreds of images and labeling them with labelimg, can take a lot of time, and in some cases it is unnecessary. Test some out-of-the-box models before taking the long route. This step won’t take long and can save you a ton of time.

I am going to use a couple of examples. I want to detect certain animals in pictures: sharks and squirrels. Let’s say we have a research purpose for doing this. Before taking the long route, let’s try the other approach.

We are going to use the official Google Colab notebook to test the different models. The notebook is pretty straightforward: if you run all the cells, you will use a common model trained on a dataset called COCO. The full name of the model is ssd_mobilenet_v1_coco_2017_11_17. This is a fast model, but it looks like it won’t work for our purposes. I uploaded an image to /content/models/research/object_detection/test_images and this is the result we got:

squirrel mscoco 3.png

Not the result we were expecting: the confidence is low and the prediction is not very good, assuming that’s a squirrel.

Before changing the code, note that you can find pre-trained, ready-to-use models in the Tensorflow detection model zoo. There are models trained on different datasets and with different levels of performance.

Let’s change a couple of lines of code and test again. First we are going to use a model trained on the iNaturalist dataset:

  • In the section “Loading Label Map” we are going to use the following code: PATH_TO_LABELS = 'models/research/object_detection/data/fgvc_2854_classes_label_map.pbtxt'. The label map helps us interpret the output of the new model: it ties the category number to a name.
  • In the section “Detection” we are going to use the following code: model_name = 'faster_rcnn_resnet101_fgvc_2018_07_19'. This tells the code which model to download.

With the new model these are the results:

download.png

Higher confidence and a weird Latin name (Sciurus carolinensis). If we use Wikipedia we find that the other known name is Eastern gray squirrel. Not bad. The same happens if we test a shark image:

whale shark naturalist.png

Using Wikipedia we can find that it’s a whale shark.

Just for the sake of experimenting, I used a model trained on the Open Images dataset. I used the following label map and model:

  • model_name = 'faster_rcnn_inception_resnet_v2_atrous_oid_2018_01_28'
  • PATH_TO_LABELS = 'models/research/object_detection/data/oid_bbox_trainable_label_map.pbtxt'

shark open image.png

Here we got a higher confidence but a less accurate detection. Depending on your use case, one model or the other could be better.
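Because the label map is what ties a category number to a name, the model name and the label map path have to be switched together. A minimal sketch of keeping the two pairs from this post side by side; the DETECTORS dictionary and the select_detector helper are hypothetical names, not part of the notebook:

```python
# Model name and label map pairs used in this post.
# The dictionary keys and the helper below are hypothetical conveniences.
DETECTORS = {
    "inaturalist": {
        "model_name": "faster_rcnn_resnet101_fgvc_2018_07_19",
        "path_to_labels": "models/research/object_detection/data/fgvc_2854_classes_label_map.pbtxt",
    },
    "open_images": {
        "model_name": "faster_rcnn_inception_resnet_v2_atrous_oid_2018_01_28",
        "path_to_labels": "models/research/object_detection/data/oid_bbox_trainable_label_map.pbtxt",
    },
}

def select_detector(key):
    """Return (model_name, path_to_labels) so both values always change together."""
    entry = DETECTORS[key]
    return entry["model_name"], entry["path_to_labels"]

model_name, PATH_TO_LABELS = select_detector("inaturalist")
print(model_name)
print(PATH_TO_LABELS)
```

The two returned values would then be pasted into the “Detection” and “Loading Label Map” sections of the notebook.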

In some cases you’ll still need a custom dataset and have to go the long route. But checking this avenue won’t hurt and might save you a lot of precious time.


How to survive multiple interview processes

As Paul Krugman says: These are strange times. We are living under unusual circumstances and there is a ton of negative stuff to think about. A lot of people have been laid off and are looking for a new gig.

For software developers the market is interesting: some companies are reducing headcount and others are hiring. Junior and mid-level software developers face an additional challenge: they might fit different positions depending on the programming languages they know or like. Which one should I study? Should I polish the ones I already know? Should I learn a new one? As always, the answer is: it depends. I’ll elaborate in the following paragraphs.

I’ve seen a lot of similarities between finding a new job and selling enterprise software (please don’t run yet, bear with me). In sales we have a pipeline where we manage opportunities and try to maximize our sales numbers. Here you only need one win; you only need to be right once. I’ll explain in simple terms what a pipeline is and how you can use it to survive this tough situation.

The pipeline has 5 stages. We have some interesting assumptions and facts:

  • Not all opportunities have to go through the 5 stages.
  • As we advance in the pipeline the probability of winning should increase.
  • Opportunities move through the pipeline at different velocities: some can change really quickly and others can be very slow.

The 5 stages are:

  1. Identify: Here opportunities are like gossip. You heard a company is hiring, or you read a post on LinkedIn. At this point you know there might be an opportunity, but you don’t even know if you have what it takes.
  2. Qualify: To move to this stage you must assess whether the opportunity is a fit for you. Review the seniority level they are looking for, years of experience, programming languages and industry.
  3. Pursue: Once you have sent your application (CV) or asked a friend to refer you, you are in Pursue. At this point, it might be a good idea to polish the languages you know or learn new ones.
  4. Closing: Now you are in the interview phase. This can vary a lot from company to company: you can have several phone interviews, you might have an on-site (at least in the past), and you could be interviewed by your future boss, to name a few.
  5. Won: You accepted the offer! Congrats! Hopefully, you’ll be at this stage pretty soon.

After that long explanation, I would say you should study new languages or polish the ones you know based on your pipeline. Taking an Elixir course because of an opportunity in the Identify stage is probably not a great idea (unless you really want to work with Elixir). You could get serious about a new language if you are in Pursue or Closing.

This advice might not apply to everyone or to every situation. It’s my 5 cents to try to help in this difficult time we are living through.

Good luck finding your next gig!

Prueba001-05.jpg

30 things I learned at my first job (Daniel Rojas)

 


How to create a practical Open AI custom environment for the rest of us (sourcing problem)

Why create a custom OpenAI environment?

Today we are using classical Machine Learning and Neural Networks to solve all kinds of problems. Reinforcement Learning is being used to solve games and some industrial applications, but I think this will change pretty soon. Instead of tagging images, we will be creating business environments for our agents to learn and perform in real life. I’ve read some tutorials on how to create an environment but the framing is the difficult part.

The sourcing problem

I wanted to build an environment that had something to do with business. All companies have sourcing departments: people who buy stuff, either to resell it (as in e-commerce) or to operate the business (you need pens, cars, Dunder Mifflin paper…).

When you buy stuff you have to make decisions. Imagine that you have 5 suppliers, each with a different price and a different reliability. There are very cheap suppliers that are not very reliable, and there is a premium one that always delivers on time. You can find the example in the following table:

Screen Shot 2020-04-12 at 11.07.22 AM.png

In this case, paying a low price carries a high risk. Every supplier sells you 5 articles at a time and you only have $1000 to buy as many articles as you can. The best-case scenario would be that you are super lucky and buy always

Framing the sourcing problem as a Reinforcement Learning problem

Before starting to write the code, there are some things we need to define. In my case this takes more time than writing the actual code:

  • Action space: Which actions can your agent take? Here we will have 5 actions, buy from 1 of the 5 suppliers.
  • Observation space: This is what our agent will see. In our case, we will let him know how many articles he currently has, how close he is to the goal (50 articles).
  • Reward: You need to give a prize to your agent when he does well. Here we will use the following function: (current_articles / max_articles)^0.9, with max_articles = 50.
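The reward function from the list above is short enough to write down directly. A minimal sketch; the function and constant names are assumptions, only the formula and the goal of 50 articles come from the text:

```python
MAX_ARTICLES = 50  # goal stated in the post

def reward(current_articles, max_articles=MAX_ARTICLES):
    """Reward grows with the fraction of the goal reached, raised to the power 0.9."""
    return (current_articles / max_articles) ** 0.9

for articles in (0, 5, 25, 50):
    print(articles, round(reward(articles), 3))
```

Because the exponent is below 1, the curve sits slightly above a straight line between 0 and 1, so early purchases are rewarded a little more than a purely proportional reward would.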

Screen Shot 2020-04-12 at 11.12.22 AM.png

For me, the trickiest points here are the observation space and the reward function. For the observation space, I read a lot of code from existing environments in the OpenAI documentation. In this table, you can find environments and their types so you can look faster. Choosing the reward function is an art; I watched a great YouTube video that gave me the idea of choosing this function.

Simple steps to create a custom Open AI Environment

Once you have framed the problem in an RL way, the rest is downhill.

  • You need to create the file structure for your environment. There is great documentation on OpenAI. Be careful with the names and make sure you replace everything; this can get cumbersome fast.
  • After you’ve done this you need to modify some specific methods:
    • __init__: where you initialize all the variables.
    • step: whenever your agent takes an action, it will call this method. The environment will return a state so the agent can make its next decision.
    • reset: once you’ve reached a terminal state (you are out of money or bought all the articles) you need to reset everything so the agent can try again.
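The three methods above can be sketched as a plain Python class with a Gym-style interface. This is not the code from the repository: it has no gym dependency, the class name is made up, and the supplier prices and reliabilities are hypothetical placeholders because the real table is only available as an image. The budget, the 5 articles per order, the goal of 50 and the reward formula come from the post.

```python
import random

class SourcingEnv:
    """Gym-style sketch of the sourcing problem (hypothetical, no gym dependency)."""

    # Hypothetical (price per order, probability of delivery) for each supplier.
    SUPPLIERS = [(60, 0.5), (80, 0.6), (100, 0.75), (120, 0.9), (150, 1.0)]
    ARTICLES_PER_ORDER = 5
    BUDGET = 1000
    GOAL = 50

    def __init__(self, seed=None):
        # Initialize all the variables; a seed makes deliveries reproducible.
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Back to a full budget and no articles; the observation is the article count.
        self.money = self.BUDGET
        self.articles = 0
        return self.articles

    def step(self, action):
        price, reliability = self.SUPPLIERS[action]
        if price <= self.money:
            self.money -= price
            # The order only arrives with the supplier's reliability.
            if self.rng.random() < reliability:
                self.articles = min(self.GOAL, self.articles + self.ARTICLES_PER_ORDER)
        cheapest = min(p for p, _ in self.SUPPLIERS)
        # Terminal state: goal reached, or not enough money for any supplier.
        done = self.articles >= self.GOAL or self.money < cheapest
        reward = (self.articles / self.GOAL) ** 0.9
        return self.articles, reward, done, {}

env = SourcingEnv(seed=0)
state = env.reset()
for _ in range(100):  # bounded loop with a random policy
    state, reward, done, info = env.step(random.randrange(5))
    if done:
        break
print(state, env.money, done)
```

In this sketch an order the agent cannot afford is simply a no-op, which is one possible design choice; ending the episode or giving a negative reward would be equally reasonable.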

Creating an agent

To test the environment I used Q-learning. Explaining it is outside the scope of this blog post. I used the code from this great tutorial; I recommend watching his videos in case you want to learn more about Reinforcement Learning. You can find the code here.
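For reference, the core of tabular Q-learning is a single update rule. The sketch below is the standard textbook update, not necessarily the tutorial's code; the learning rate, discount factor and the table layout (11 states for 0, 5, ..., 50 articles and 5 actions) are assumptions made for illustration.

```python
def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.95):
    """Move Q(state, action) toward reward + gamma * max over a' of Q(next_state, a')."""
    best_next = max(q_table[next_state])
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]

# Hypothetical table: 11 states (articles // 5) by 5 actions (suppliers), all zeros.
q_table = [[0.0] * 5 for _ in range(11)]

# One illustrative transition: in state 0, supplier 4 delivered, moving to state 1.
new_value = q_update(q_table, state=0, action=4, reward=0.5, next_state=1)
print(new_value)
```

During training the agent would call this after every step and pick actions mostly greedily from the table, with some random exploration.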

Results

I ran the agent with different training episodes and plotted how he performed.

With 1 training episode (almost random):

The lowest action is riskier but the cheapest. In this example, it managed to buy 30 articles.

Using 600 training episodes it managed to solve the problem. It was able to buy 50 articles and use an average approach.

Code

You can find the code on this repository.

I’ll be happy to hear about experiments, ideas or questions on Twitter.


Easy and fast path to Video Object Detection (counting sharks)

Video Object Detection is a very interesting problem that could help a lot of people. I found out about it talking to a shark researcher (maybe not his exact title). They have grad students counting sharks in videos from an underwater camera. These videos can be very long, and sometimes there are no sharks for hours. I thought about Machine Learning instantly; what could go wrong? I started reading about it and found different approaches.

  • Auto ML Solution (Google, MSFT…): I used these solutions in the past with images, with good results. The con is that these services do not provide video support, at least I was not able to find it.
  • Tensorflow: I watched a ton of videos with examples of the Object Detection API. Be careful with the videos and search for recent ones: version changes can make a tutorial very hard to follow. I had some trouble trying to train the model with my own images. It might have been a combination of the documentation, my package management and maybe luck. I ended up looking for another way.
  • Tensorflow Object Counting API: I found this repository. It has great examples and it’s built on top of Tensorflow. I still had some problems training on my own images. My only comment would be that this API still lacks the abstraction I wanted to see in an API, at least for the training part.
  • Detecto: I found out about this repository, and the first thing I noticed was that it promised the abstraction I was looking for. I managed to train with my own images, and all the different examples are ~5 lines of code. You don’t need to understand PyTorch in order to use it. I was able to run it on Google Colab, where the free GPUs made the training process faster. At some point moving to Tensorflow could make sense, but to start I recommend Detecto.
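Whichever detector is used, the counting part reduces to filtering per-frame detections. A minimal sketch, assuming the detector output has already been turned into a list of (label, score) pairs per frame; that input format, the function name and the 0.6 threshold are hypothetical:

```python
def frames_with_label(detections_per_frame, label="shark", min_score=0.6):
    """Return (frame_index, count) for every frame with at least one confident detection."""
    hits = []
    for index, detections in enumerate(detections_per_frame):
        count = sum(1 for name, score in detections if name == label and score >= min_score)
        if count:
            hits.append((index, count))
    return hits

# Hypothetical detections for four frames of an underwater video.
video_detections = [
    [],                                  # empty water
    [("shark", 0.91)],                   # one confident shark
    [("shark", 0.35), ("fish", 0.80)],   # low-confidence shark is ignored
    [("shark", 0.88), ("shark", 0.72)],  # two sharks
]

print(frames_with_label(video_detections))
```

With a list like this, a researcher could skip straight to the frames that contain sharks instead of watching hours of empty footage.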

I need to feed more images to the model, but here is an example of the results:

shark_result.jpeg


Your diverse background can be your main advantage. Supercharge it this quarantine.

I’ve seen a lot of people on Twitter learning and coding during this quarantine, which is great. Learning and doing hard stuff can keep your mind healthy and you won’t get bored. There is something I think we should not miss while spending time at home learning.

Imagine that in the whole industry everybody came from the same background. Everybody majored in Computer Science, watched the same movies, had the same interests and liked to work on the same problems. We were not very far from this some years ago. Now we are in a very different place as an industry: we have a lot of folks coming from different places (education, work experiences, interests, culture…), which is what we need to solve the most challenging problems. We are good, but we can be better.

I’ve seen a lot of people studying frontend, backend, ML, Blockchain and similar things. That is good and you need to do it, but don’t forget the other stuff. By that I mean the stuff that makes you unique. As I mentioned at some point, hiring decisions can be very tight and sometimes you want to hire 2 people but just can’t. It’s the small stuff that makes the difference (something very similar happens when thinking about promotions and raises). The good news is that the small stuff is hard to copy because it has to do with you.

Here you’ll find a list about “other stuff” and how you can supercharge it this quarantine:

  • Your past work experience: This is a great tool for people transitioning into tech. These experiences can be a great advantage if they align with the role you are applying to. For example, you were a tour guide before jumping into Computer Science and you are now applying for a Software Developer role at Tripadvisor. In order to supercharge it, learn more about the industries you’ve worked in: take a small MOOC, read some news about them or read a book on that field. We tend to underestimate our knowledge about a certain industry. Trust me, after a couple of years working in a specific industry you’ll have a ton of knowledge. Use it!

 

  • Your interests: Humans tend to be curious about a group of fields, and these can be uncorrelated. You might want to learn about anthropology, astronomy, medicine, sports, food or almost anything. Take a small break from coding and learn about a field you are interested in. This will have two effects: you might find relationships between the field and coding, and it could help you in a future interview, either because of the industry you are applying to or to build rapport with the interviewer. Just taking a break from coding can bring great results.

 

  • An action related to coding: You’ve been coding for the past 2 days non-stop; that’s good. But also keep in mind other related tasks that are not just coding. Try writing a blog post, creating a video tutorial or writing a small book. These are great superpowers that can make you a better developer.

The main goal of this quarantine is to stay safe. It’s great that a lot of people are learning new stuff and hopefully a lot will be able to land new gigs soon. Keep coding (and doing other stuff)!


4 non-tech interview questions that can make you stand out and land the job

The interview process is an art, probably a broken one. I know it has a lot of different factors, some of them maybe even random. But let’s focus on what you can control: what you prepare and how well you perform. For the technical side (algorithms, inverting a binary tree…) there are a ton of good resources you can look for. The 4 questions below, in contrast, tend to be overlooked. These questions are not enough to turn around a bad technical interview, but you can check all the boxes if you ace them (a lot of these processes finish with very tight decisions).

  1. What have you heard about this company? This is a great opportunity for you to show how much you have prepared and how badly you want this job. You don’t want it to look like this is just another interview in your whole process or, worse, practice for the interviews you really care about. Here are some ways you can prepare:
    • Start by understanding the sector. You need to know which sector they are in and how they make money. Google can help you, or talk to a friend who knows about the company/sector. Understand what the sector is going through: is it growing or shrinking? Which are the biggest competitors/threats?
    • Understand the company from 10,000 ft. Are they public or private? How much revenue did they make last year? Any interesting acquisitions lately? Something important in the news? If the company is public, try searching for the latest earnings call transcript. This will help a lot and you’ll find opportunities to use your preparation.
    • Understand the company on a local scale. How is this branch doing? When did they open? Are they the new kids on the block? What is their relationship with the mothership?
  2. Tell me about the hardest problem you have solved. This could be considered a technical question, but I want you to focus on the storytelling. Set a good context about that last problem that you could not stop thinking about. Be very specific about the problem, the avenues you tried and how you ended up solving it. You need to project grit and show that you never give up. Also, explain how you would approach a similar problem now that you have solved that one. Extra credit if the problem is related to the technologies they are working with.
  3. Tell me something interesting you read lately. The interviewer is trying to find out if you really like what you do. Do you read about stuff in your free time? How do you keep up with all the stuff that is changing? I don’t recommend that you read a super complex article just so you can show off. Just try to remember what you have read lately, and explain why it was interesting and what you learned. This does not need to be 100% related to the technologies they work on; it could be about another field that is interesting for you (cybersecurity, networking, anything related to tech).
  4. Do you have any questions for me? A lot of people miss this opportunity to show they have prepared and to gain valuable information that could help in the following processes. Ask genuine questions about their job, work life at that company, how they have been affected by global events (COVID-19, for example) and anything else that you might want to know. Try to come prepared with some questions, but keep your eyes open for any question that might come up during the conversation.

An interview should be a conversation. It could happen that these questions are not included. Even so, you can use different parts of the interview to show what you have prepared. The easiest part is at the end, when they let you ask questions. Use that free-form time to show how you have prepared and why you are the right person for the job.


Which programming language should you learn on this quarantine?

I recently had a conversation with a group of college students about whether they should learn R or Python. This might be an ML/AI kind of question, although the possible answers apply to different areas of Computer Science. How do you choose between studying React or Angular? Rails or Laravel?

The first idea is that programming languages are like instruments. We use instruments to play music, but you can interpret the same song with different instruments. It’s really hard to choose between programming languages; every language has its pros and cons. If you understand the fundamental principles (the music) well, it does not matter which instrument you choose.

The second idea is borrowed, inspired by a great tweet from Edouard Harris. If you are starting to code, learn a language that interests you or that is related to one of your interests (building videogames, for example). Once you are more familiar with coding, you can learn a language with bigger business applications.

The third idea is to ask what your purpose is for learning a language during the quarantine. Do you just want to have fun? Do you want to land a new gig? Pure intellectual curiosity? Answering this question might be a big percentage of the answer you are looking for.

It does not matter what you learn this quarantine, stay safe and happy coding.


What every engineer should know about business cases

A lot of people think engineering and business should not be in the same sentence. I think they should be together more often. A lot of the problems I’ve seen in companies come from engineers and business people not talking to each other. That miscommunication can bring a lot of trouble to an organization. On a more positive note, if you are starting as an engineer, understanding the business and communicating in its language can be a great superpower.

The purpose of this text is not to go through the different reasons the disconnection happens, but to propose a bridge. I am not going to suggest we send all our business friends to CS school (although it could also help); the bridge in this scenario goes the other way around.

Think about any interaction you have with somebody from the business side (a Product Owner or something like that). A lot of times the conversation goes like this: engineers think they should build X, but the business does not understand why. This causes two fundamental problems: engineers can’t build the right stuff, so everything crumbles (hello technical debt!), and the business does not understand why engineers want to build X (it looks like they just want to scratch their own itch).

Next time you are planning this kind of conversation, frame it as a business case. This is a very simple exercise, where the central idea is to have a return on investment. We are going to build X, it is going to cost the company Y and we are going to make Z money, so in T time we are going to recover the money; after that we’ll profit. There are far more complex and complete approaches than this, but think of it in a simple way. If you buy a car and use it for Uber, how long will it take you to recover the money you invested in the car?

I know it’s pretty simple, but I want you to apply it to engineering. There are two kinds of costs: development (your time and your team’s) and infrastructure (you are probably working in the cloud, so you’ll need a place to host all those shiny microservices or Python scripts). Getting the cost should not be hard, and if it starts to get very complex a good approximation will work.

The other part of the equation (how much money the company is going to make, or Z in our equation) can be a little bit more complex but is still doable. There are several ways the company you work for can make money with software:

  • Selling more: You’ll be able to handle more traffic, products or regions, just to name a few.
  • Being more efficient: It’ll take less time to close a deal, so if you are faster you’ll close more deals. Or you reduce costs: now everything is more automated and you don’t need a lot of humans (I am not saying you should suggest replacing other coworkers with bash scripts, not cool; hopefully, they’ll be able to perform more meaningful tasks than what they were doing before).
  • Reducing risks: If we use RabbitMQ there will be a lower chance of everything crumbling on Black Friday. So you won’t wake me up in the middle of the night saying that the site is down and people can’t buy cat food.

I am sure there are a lot more ways; these were meant to give you examples and spark your inspiration. A cool trick is to involve the business people: if you explain your idea a little, they can think of other ways of framing it.

Once you have costs and ways to make money, it’s just a matter of working out how long it’ll take to recover the money you invested.
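The arithmetic behind this is a simple payback period. A minimal sketch, assuming the development cost is paid once and the infrastructure cost and the extra income are monthly; the function name and every number below are made up for illustration:

```python
def payback_months(development_cost, monthly_infrastructure_cost, monthly_gain):
    """Months needed to recover the one-time development cost from the net monthly gain."""
    net_monthly_gain = monthly_gain - monthly_infrastructure_cost
    if net_monthly_gain <= 0:
        raise ValueError("the project never pays for itself with these numbers")
    return development_cost / net_monthly_gain

# Hypothetical project: 30,000 to build, 500 a month to host, 3,000 a month in extra sales.
print(payback_months(30000, 500, 3000))
```

In the terms used earlier, the development cost plays the role of Y, the net monthly gain is Z and the result is T. If the net gain is zero or negative, the conversation with the business side is probably a different one.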

This is not a silver bullet: not all your projects will be approved with this method. It will, however, start some very interesting conversations that might help you build what you think is worth building.

Happy to hear from you on twitter.


A simple test to know when you need to follow another leader

First things first, I hope you never need this test. I hope you have a long and happy career and you always love working with your boss. Although, by simple probability, this might not happen. There is a big chance that you end up working with somebody you don’t like or, worse, you might be complete opposites and have different sets of values.

Given that this happens at some point in your career, it brings a very interesting crossroads. This might not be only negative, it could be a chance to work on your versatility, communications skills or just your ability to push through difficult times. But there is a point where it would just be too much. A bad leader can damage your life, your career and you can end up not liking a job that you loved in the past.

How do you know when you are learning and when you should look for a different direction (new job, new team, new life)? I found this simple test quite useful. Imagine you and your boss have a lot of success: you destroy all the company goals, you have a great year and you make a dent in the universe. After you have imagined that, ask yourself the following question: Do you want to be with him in that position?

I told you the test was simple; interpreting the results might be more complicated. This simple question can help you think about how different you are from your boss. If huge success still sounds like a bad idea, you might want to revisit the idea of asking to work with someone else. This could also help you understand if you and your boss are just going through a rough patch.
