Saturday, November 1, 2014

Yet Another Thing Your Applicant Tracking System Can't Do


Photo credit and license: Pierre-Olivier Carles; CC BY 2.0

Like it or not, e-mail marketing - also known as "bulk e-mailing", but more commonly referred to as "spam" - is a fact of life. Most of the time it's annoying, sometimes - creepy, occasionally - hilarious... The bulk-mailed message I received a few days ago from a major nationwide IT recruiting-and-staffing agency that shall remain unnamed falls under the category of... well... kind of thought-provoking. At least that's the effect it had on me. The minimally redacted (to protect the identity of the company and its client) text of the e-mail is below.

    **Evening Shift Testing Opportunity! 5 positions!!**

    Great opportunity in [area of about 450 sq. miles at least 50 miles away; name redacted] for computer testers! Hours will be 5PM to 12:00AM. Two month contract and an hourly rate of 18/hr also there will be a completition bonus upon successful work of $500 dollars!

    We are selecting candidates for this position - NO INTERVIEW with Client. No testing experience needed - must be comfortable working with computers, following instructions, and documenting work.

    Project Overview: 
    • Manual testing of access of [software name redacted] system on every machine that was installed
    • Test and document that application is working properly.
    • New IT hardware (wires, jacks, screens and lap tops) have been installed, replacing PCs and existing lap tops. 
    If you are interested, please let me know a time that you are available for a short interview/conversation either today or tomorrow.


Even though I do not see any reason for the sender to be so excited (too many exclamation points for my liking), I must admit that - in comparison with the absolutely meaningless junk I regularly receive - this is a reasonably informative e-mail. The problem here is not with the content. It is with the targeting.

It is obvious that they are looking for candidates who don't have "testing experience" and are just "comfortable working with computers, following instructions, and documenting work" (otherwise they would have to bump up the pay rate quite a bit, and that is precisely what they don't want to do). I am 100% sure, however, that all the e-mails were in fact sent out to people who do have software testing experience (me being one of them). Why? Because there is no easy way to "explain" to an applicant tracking system that you want a list of people who are reasonably intelligent, not computer-averse, yet have no real technical experience that would make them too expensive. To a human, that may sound like a very simple concept to grasp, but to a machine - not really.

Contrary to what the salespeople of every ATS vendor want their prospective clients to believe, ATS's are fairly primitive database applications (they can't think). If the resume parser of an ATS accurately extracts all the skills from the resumes of job applicants (and - based on my experience - that is a very big if), here is more or less what the extracted data might look like in the ATS database (these examples are made-up and intentionally oversimplified, of course):

Table 1: candidate

| candidate_id | first_name | last_name |
|---|---|---|
| 1 | John | Doer |
| 2 | Forrest | Grumps |
| 3 | Robert | Plaint |
| 4 | Ivan | Andropopoff |
| 5 | Kate | Bushido |

    In real life, this table would have more columns (street address, ZIP code, state, e-mail, phone, etc.), but we'll keep it simple.

Table 2: skill

| skill_id | candidate_id | skill_name | duration (years) |
|---|---|---|---|
| 1 | 1 | database administration | 10 |
| 2 | 1 | SQL | 10 |
| 3 | 2 | data analysis | 2.5 |
| 4 | 2 | business analysis | 2.5 |
| 5 | 2 | SAP | 2.5 |
| 6 | 3 | business systems analysis | 3.25 |
| 7 | 3 | software testing | 4.5 |
| 8 | 3 | technical writing | 5 |
| 9 | 4 | web development | 4 |
| 10 | 4 | C# | 4 |
| 11 | 4 | customer support | 1 |
| 12 | 5 | clerical | 1.25 |
| 13 | 5 | MS Office | 1.25 |

    The skill table is related to the candidate table through the candidate_id column. It holds the skills, and the years of experience in each, that the resume parser extracted from every resume ever submitted to the ATS or imported into it from other sources. Relevant keywords are simply extracted from job titles and descriptions of job duties, while duration is calculated from the start and end dates of each job. Some ATS's don't even extract anything, in which case keyword search is performed on the work_history table (see below).

Table 3: work_history

| job_id | candidate_id | company_name | job_title | job_duties | start_date | end_date |
|---|---|---|---|---|---|---|
| 1 | 1 | XYZ Data, Inc. | DBA | Blah database administration blah blah. Blah Microsoft SQL Server blah blah. | 2004-11-01 | 2014-11-01 |
| 2 | 2 | ABCD, Inc. | Data Analyst | Blah blah data analysis blah SAP blah. Business analysis blah blah. | 2011-04-01 | 2013-10-31 |
| 3 | 3 | BaloneySoft, L.L.C. | BSA | Blah business systems analysis blah. | 2010-01-01 | 2013-04-01 |
| 4 | 3 | Bupkis Solutions, Inc. | Software Tester | Blah blah software testing. | 2005-06-01 | 2009-12-31 |
| 5 | 3 | Gross Tech Services, L.L.C. | Technical Writer | Blah technical writing blah. | 2000-05-01 | 2005-05-31 |
| 6 | 4 | WebPhaktory, L.L.C. | Programmer | Blah web development blah C#. | 2010-07-01 | 2014-07-01 |
| 7 | 4 | Tundra Hosting, Ltd. | Customer Support Rep | Blah customer support blah. | 2009-07-01 | 2010-06-30 |
| 8 | 5 | SmallData, Inc. | Administrative Assistant | Blah clerical duties. Blah MS Office blah. | 2013-01-01 | 2014-04-10 |
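
To make the "keywords plus dates" idea concrete, here is a minimal sketch of how a naive parser might populate the skill table from work_history. It is an illustration only, not how any particular ATS works: the `KEYWORDS` vocabulary and the function names are my own assumptions, I use Python's built-in sqlite3 module as a stand-in for the real database, and only candidate #3's rows are loaded.

```python
import sqlite3

# Hypothetical skill vocabulary; a real parser would use a far larger one.
KEYWORDS = ["business systems analysis", "software testing", "technical writing"]

# Candidate #3's jobs from the work_history table (company_name omitted).
JOBS = [
    (3, 3, "BSA", "Blah business systems analysis blah.", "2010-01-01", "2013-04-01"),
    (4, 3, "Software Tester", "Blah blah software testing.", "2005-06-01", "2009-12-31"),
    (5, 3, "Technical Writer", "Blah technical writing blah.", "2000-05-01", "2005-05-31"),
]


def build_db():
    """Create an in-memory database holding a trimmed work_history table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE work_history (job_id INTEGER PRIMARY KEY, candidate_id INTEGER,"
        " job_title TEXT, job_duties TEXT, start_date TEXT, end_date TEXT)"
    )
    conn.executemany("INSERT INTO work_history VALUES (?, ?, ?, ?, ?, ?)", JOBS)
    return conn


def extract_skills(conn, keywords=KEYWORDS):
    """Return (candidate_id, skill_name, years) rows found by plain keyword matching."""
    query = """
        SELECT candidate_id,
               SUM(julianday(end_date) - julianday(start_date)) / 365.25
        FROM work_history
        WHERE (job_title || ' ' || job_duties) LIKE ('%' || ? || '%')
        GROUP BY candidate_id
    """
    rows = []
    for keyword in keywords:
        # Every job that mentions the keyword adds its full length to the skill.
        for candidate_id, years in conn.execute(query, (keyword,)):
            rows.append((candidate_id, keyword, round(years, 2)))
    return rows


if __name__ == "__main__":
    for row in extract_skills(build_db()):
        print(row)
```

The computed durations come out close to, but not exactly equal to, the figures in the skill table, which I rounded by hand. Note that the sketch credits the full length of a job to every keyword found in it, which is one more reason the extracted numbers deserve some suspicion.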


It probably took you less than 30 seconds to figure out which of the five fictitious characters above would be the right candidate for the job (bearing in mind the pay rate constraint).

Unfortunately, in real life you would have to deal not with five rows in the candidate table, but more like fifty thousand or more, and the number of rows representing skills and jobs per applicant would be higher as well. So just looking at the data and figuring things out intuitively would not work.

Apparently, this recruiter - without too much thinking - simply searched the database for testers. In our made-up example database, this search would return one result - candidate #3. In real life, however, this search might return thousands of results, all of them with at least some experience in testing (and, possibly, some other areas), which means that hardly any of them will find the pay rate of $18/hour attractive.
    By the way, this is exactly how all kinds of "job agents" work. They just do keyword matching. If you have ever used them, you should know that they are pretty much useless (in addition to being annoying).

Searching for those who do NOT have testing experience would return every person in the database (except the "testers", of course), which in real life might be tens and tens of thousands of records. Some agencies seem to have absolutely no reservations about sending out massive amounts of spam to every single address they can lay their hands on, but more respectable ones try not to piss off the "high-end talent pool". So just plain exclusion of those with testing experience is not a viable option either.
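
For the curious, the two keyword searches just described can be sketched against the made-up data as follows. This is a rough illustration under my own assumptions: sqlite3 stands in for the ATS database, and the function names `with_skill` and `without_skill` are hypothetical.

```python
import sqlite3

# Rows copied from the made-up candidate and skill tables.
CANDIDATES = [
    (1, "John", "Doer"), (2, "Forrest", "Grumps"), (3, "Robert", "Plaint"),
    (4, "Ivan", "Andropopoff"), (5, "Kate", "Bushido"),
]
SKILLS = [  # (skill_id, candidate_id, skill_name, duration in years)
    (1, 1, "database administration", 10), (2, 1, "SQL", 10),
    (3, 2, "data analysis", 2.5), (4, 2, "business analysis", 2.5), (5, 2, "SAP", 2.5),
    (6, 3, "business systems analysis", 3.25), (7, 3, "software testing", 4.5),
    (8, 3, "technical writing", 5), (9, 4, "web development", 4), (10, 4, "C#", 4),
    (11, 4, "customer support", 1), (12, 5, "clerical", 1.25), (13, 5, "MS Office", 1.25),
]


def build_db():
    """Create an in-memory database with the candidate and skill tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE candidate (candidate_id INTEGER PRIMARY KEY,"
                 " first_name TEXT, last_name TEXT)")
    conn.execute("CREATE TABLE skill (skill_id INTEGER PRIMARY KEY,"
                 " candidate_id INTEGER, skill_name TEXT, duration REAL)")
    conn.executemany("INSERT INTO candidate VALUES (?, ?, ?)", CANDIDATES)
    conn.executemany("INSERT INTO skill VALUES (?, ?, ?, ?)", SKILLS)
    return conn


def with_skill(conn, keyword):
    """Ids of candidates who have at least one skill containing the keyword."""
    query = """
        SELECT DISTINCT candidate_id FROM skill
        WHERE skill_name LIKE ('%' || ? || '%')
        ORDER BY candidate_id
    """
    return [row[0] for row in conn.execute(query, (keyword,))]


def without_skill(conn, keyword):
    """Ids of candidates who have no skill containing the keyword."""
    query = """
        SELECT candidate_id FROM candidate
        WHERE candidate_id NOT IN (
            SELECT candidate_id FROM skill
            WHERE skill_name LIKE ('%' || ? || '%')
        )
        ORDER BY candidate_id
    """
    return [row[0] for row in conn.execute(query, (keyword,))]


if __name__ == "__main__":
    conn = build_db()
    print("testers:", with_skill(conn, "testing"))
    print("everyone else:", without_skill(conn, "testing"))
```

The first search finds only the experienced tester. The second returns everybody else, the DBA and the programmer included, so neither list is of much use for an $18/hour job.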

Instead of filtering by keywords (which, as shown above, is unlikely to produce the desired result), I would suggest first trying to retrieve potential candidates who are likely to be interested in entry-level jobs by selecting all candidates with combined work experience below a certain value, say, under 2 years (using the work_history table). Then you can narrow it down by selecting those who have attended college, but whose degree is not higher than a bachelor's (in any ATS, there is a table that contains education data - I just did not include it above). If the result set is still too large, apply some generic keyword filters, like "computer", "Windows", etc. Alternatively (or in addition), you might try including/excluding certain college majors. This is not an exact science, but such an "exploratory" approach should produce better results than what the recruiter who sent me the e-mail achieved.
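
Here is a minimal sketch of the first step only, the combined-experience filter. Again, sqlite3 stands in for the real database and the function name `entry_level` is my own invention. The sketch assumes that a candidate's jobs do not overlap in time, and it skips the education and keyword steps because the example data has no education table.

```python
import sqlite3

# Rows copied from the made-up candidate table.
CANDIDATES = [
    (1, "John", "Doer"), (2, "Forrest", "Grumps"), (3, "Robert", "Plaint"),
    (4, "Ivan", "Andropopoff"), (5, "Kate", "Bushido"),
]
# (job_id, candidate_id, start_date, end_date); other work_history columns omitted.
JOBS = [
    (1, 1, "2004-11-01", "2014-11-01"),
    (2, 2, "2011-04-01", "2013-10-31"),
    (3, 3, "2010-01-01", "2013-04-01"),
    (4, 3, "2005-06-01", "2009-12-31"),
    (5, 3, "2000-05-01", "2005-05-31"),
    (6, 4, "2010-07-01", "2014-07-01"),
    (7, 4, "2009-07-01", "2010-06-30"),
    (8, 5, "2013-01-01", "2014-04-10"),
]


def build_db():
    """Create an in-memory database with candidate and trimmed work_history tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE candidate (candidate_id INTEGER PRIMARY KEY,"
                 " first_name TEXT, last_name TEXT)")
    conn.execute("CREATE TABLE work_history (job_id INTEGER PRIMARY KEY,"
                 " candidate_id INTEGER, start_date TEXT, end_date TEXT)")
    conn.executemany("INSERT INTO candidate VALUES (?, ?, ?)", CANDIDATES)
    conn.executemany("INSERT INTO work_history VALUES (?, ?, ?, ?)", JOBS)
    return conn


def entry_level(conn, max_years=2.0):
    """Candidates whose combined job time is below max_years, least experienced first."""
    query = """
        SELECT c.candidate_id, c.first_name, c.last_name,
               SUM(julianday(w.end_date) - julianday(w.start_date)) / 365.25 AS years
        FROM candidate AS c
        JOIN work_history AS w ON w.candidate_id = c.candidate_id
        GROUP BY c.candidate_id, c.first_name, c.last_name
        HAVING SUM(julianday(w.end_date) - julianday(w.start_date)) / 365.25 < ?
        ORDER BY years
    """
    return [(cid, first, last, round(years, 2))
            for cid, first, last, years in conn.execute(query, (max_years,))]


if __name__ == "__main__":
    for row in entry_level(build_db()):
        print(row)
```

With the 2-year cutoff, the only candidate left in the example data is Kate Bushido, the administrative assistant. Candidates with no rows in work_history at all would be dropped by the inner join, so a real query would probably want a LEFT JOIN to keep recent graduates.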

But... there is a BIG BUT, actually - a few BIG BUTs:
  • To the best of my knowledge, no ATS will let you perform all those manipulations through the GUI (graphical user interface). It is very difficult to make a GUI as powerful and flexible as SQL (Structured Query Language) and keep it reasonably simple at the same time.
  • Unless it's an in-house application, no ATS vendor will ever let you access the data in the database (your data, by the way) directly (even in read-only mode).
  • Even if they were granted access to the data in their ATS database, most recruiters/sourcers would not be able to do much using SQL. In this industry, using Boolean search operators is considered "dark art" :-P

Besides, recruiting/staffing agencies seem to have a deeply rooted belief that it is all just a numbers game. Bulk e-mailing costs practically nothing, services of overseas BPO (business process outsourcing) centers cost next to nothing, and recent English Literature or Political Science grads working as entry-level recruiters/sourcers/screeners are not expensive either. So the agencies don't even consider spending money on data analysis or going beyond the primitive and largely inefficient keyword-matching technique.
