Chapter 3 Missing values
First, we load the data and show the first few rows.
## name street_address
## 1 Smoke Jazz and Supper Club 2751 Broadway New York, NY 10025
## 2 Tavern on the Green 1 Tavern on the Green New York, NY 10023
## 3 ABC Kitchen 35 East 18th Street New York, NY 10003
## google_map
## 1 //www.google.com/maps/search/?api=1&query=40.8012200%2C-73.968013
## 2 //www.google.com/maps/search/?api=1&query=40.7723930%2C-73.97862
## 3 //www.google.com/maps/search/?api=1&query=40.7377830%2C-73.989714
## review_count phone website
## 1 2155 (212) 864-6662 http://www.smokejazz.com/
## 2 7029 (212) 877-8684 http://www.tavernonthegreen.com/
## 3 6031 (212) 475-5829 http://www.abckitchennyc.com/
## restaurant_type average_review food_review service_review
## 1 Contemporary American 4.42155 3.9 4.2
## 2 American 4.67029 4.4 4.4
## 3 Contemporary American 4.76031 4.6 4.4
## ambience_review value_review price_range star_1 star_2 star_3 star_4 star_5
## 1 4.5 4.0 $31 to $50 2 7 5 24 62
## 2 4.7 4.0 $31 to $50 1 2 6 19 72
## 3 4.6 4.1 $31 to $50 1 2 5 15 77
## description
## 1 Smoke has augmented its reputation as one of Manhattan’s most distinguished jazz venues with an addition very uncommon to jazz clubs—great food. Smoke serves innovative American Bistro fare developed by critically acclaimed executive chef Patricia Williams. Smoke is proud to be New York's only boutique Jazz & Supper Club with an award winning chef and world-class jazz seven nights a week.IMPORTANT RESERVATION INFORMATION:Please check the Smoke Calendar page at www.smokejazz.com to view music charge policies for the date you are planning to join us. If you are a Smoke Rewards Card holder, you are entitled to use your card Monday - Thursday, with the exception of some special events. We kindly recommend that our guests purchase tickets in advance on our website. You are always welcome to call us with questions or concerns regarding music charge policies at (212) 864-6662. Thank you!
## 2 Magical is a word thrown around a lot when discussing Tavern On The Green and one can’t help but feel magic in the air. Jim and David, architect Richard Lewis and landscape architect Robin Key, preserved the Victorian/ Gothic elegance of the semi-circular building; it is authentic, natural, elegant and sexy. It has been re-built to spectacular precision and the décor is of a grand farmhouse one might find on the property of an Italian Villa or a historic Hudson River Valley mansion.
## 3 ABC Kitchen with Jean-Georges: passionately committed to offering the freshest organic and local ingredients possible.ABC Kitchen presents a changing menu that is locally sourced and globally artistic in a fresh and articulate space.
## restaurant_main_type latitude longitude postal_code
## 1 Contemporary American 40.80113 -73.96819 10025
## 2 American 40.77219 -73.97772 10023
## 3 Contemporary American 40.73790 -73.98950 10003
Then, we check the number of NA’s in each column.
## name street_address google_map
## 0 0 0
## review_count phone website
## 0 0 2
## restaurant_type average_review food_review
## 0 0 0
## service_review ambience_review value_review
## 0 0 0
## price_range star_1 star_2
## 0 0 0
## star_3 star_4 star_5
## 0 0 0
## description restaurant_main_type latitude
## 0 0 0
## longitude postal_code
## 0 1
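A per-column count like the one above can be produced with base R. The following is a minimal sketch, not necessarily the code used for this chapter: the data frame `restaurants` and its values are invented stand-ins, built so that the counts mirror the output above (2 in `website`, 1 in `postal_code`).

```r
# Toy stand-in for the restaurant data; names and values are invented.
restaurants <- data.frame(
  name        = c("A", "B", "C", "D"),
  website     = c("http://a.example", NA, NA, "http://d.example"),
  postal_code = c(10025, 10023, NA, 10003),
  stringsAsFactors = FALSE
)

# is.na() returns a logical matrix; colSums() counts the TRUE cells per column.
na_counts <- colSums(is.na(restaurants))
print(na_counts)
```

Because `colSums()` keeps the column names, the result can be indexed by name, for example `na_counts[["website"]]`.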
There are 3 NA’s in total: 2 in the website column and 1 in the postal_code column. To find out where they are, we plot the missing-value patterns by row with the mi package.
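As a dependency-free complement to the mi plot, the positions of the missing cells can also be listed with base R. This is a swapped-in technique rather than the plotting code used here, and the `restaurants` data frame is an invented stand-in.

```r
# Toy stand-in for the restaurant data; names and values are invented.
restaurants <- data.frame(
  name        = c("A", "B", "C", "D"),
  website     = c("http://a.example", NA, NA, "http://d.example"),
  postal_code = c(10025, 10023, NA, 10003),
  stringsAsFactors = FALSE
)

# Row numbers that contain at least one NA.
incomplete_rows <- which(!complete.cases(restaurants))

# One (row, col) pair for every NA cell.
na_cells <- which(is.na(restaurants), arr.ind = TRUE)

# Row-wise missingness pattern: one string per row, e.g. "ok NA ok".
patterns <- apply(is.na(restaurants), 1,
                  function(r) paste(ifelse(r, "NA", "ok"), collapse = " "))

print(incomplete_rows)
print(na_cells)
print(table(patterns))
```

Tabulating the patterns gives the same information a missing-pattern plot conveys: which combinations of columns are missing together, and how often.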

From the plot above, we can see where all the missing values are. Since they make up only a small part of our data, we remove the affected rows with the na.omit() function. We then check again to make sure there are no NA’s left in the dataset.
## name street_address google_map
## 0 0 0
## review_count phone website
## 0 0 0
## restaurant_type average_review food_review
## 0 0 0
## service_review ambience_review value_review
## 0 0 0
## price_range star_1 star_2
## 0 0 0
## star_3 star_4 star_5
## 0 0 0
## description restaurant_main_type latitude
## 0 0 0
## longitude postal_code
## 0 0
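The removal and re-check can be sketched as follows. The `restaurants` data frame is again an invented stand-in; only `na.omit()` is taken from the text.

```r
# Toy stand-in for the restaurant data; names and values are invented.
restaurants <- data.frame(
  name        = c("A", "B", "C", "D"),
  website     = c("http://a.example", NA, NA, "http://d.example"),
  postal_code = c(10025, 10023, NA, 10003),
  stringsAsFactors = FALSE
)

# na.omit() drops every row that has an NA in any column.
restaurants_clean <- na.omit(restaurants)

# Re-check: every per-column count should now be zero.
print(colSums(is.na(restaurants_clean)))

# How many rows were dropped.
rows_dropped <- nrow(restaurants) - nrow(restaurants_clean)
print(rows_dropped)
```

`na.omit()` records the indices of the dropped rows in the `na.action` attribute of its result, which is useful for auditing what was removed.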
We then check for NA’s in our second dataset.
## DBA
## 1 GOLDBAR
## 2 GOLDBAR
## 3 GOLDBAR
## 4 EAT-A-BAGEL (JOHN A NOBLE FERRY BOAT)
## 5 RESTAURANT ON 58 STREET
## 6 RESTAURANT ON 58 STREET
## VIOLATION.DESCRIPTION
## 1 Plumbing not properly installed or maintained; anti-siphonage or backflow prevention device not provided where required; equipment or floor not properly drained; sewage disposal system in disrepair or not functioning properly.
## 2 Food contact surface not properly washed, rinsed and sanitized after each use and following any activity when contamination may have occurred.
## 3 Personal cleanliness inadequate. Outer garment soiled with possible contaminant. Effective hair restraint not worn in an area where food is prepared.
## 4
## 5 Cold food item held above 41o F (smoked fish and reduced oxygen packaged foods above 38 oF) except during necessary preparation.
## 6 Non-food contact surface improperly constructed. Unacceptable material used. Non-food contact surface or equipment improperly maintained and/or not properly sealed, raised, spaced or movable to allow accessibility for cleaning on all sides, above and underneath the unit.
## CRITICAL.FLAG SCORE GRADE GRADE.DATE
## 1 N 13 A 05/31/2013
## 2 Y 13 A 05/31/2013
## 3 Y 13 A 05/31/2013
## 4 0 A 06/07/2013
## 5 Y 9 A 02/11/2014
## 6 N 9 A 02/11/2014
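Here are the numbers of missing values per column: VIOLATION.DESCRIPTION and CRITICAL.FLAG each have 1755, and the other columns have none.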
## DBA VIOLATION.DESCRIPTION CRITICAL.FLAG
## 0 1755 1755
## SCORE GRADE GRADE.DATE
## 0 0 0
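In the preview above, row 4 shows blank cells rather than a printed NA, which suggests the missing entries may be stored as empty strings. If so, they have to be converted before `is.na()` can count them. The sketch below assumes that situation; the `inspections` data frame and the helper `blank_to_na()` are hypothetical and not taken from the chapter's code.

```r
# Toy stand-in for the inspection data; values are abbreviated or invented.
inspections <- data.frame(
  DBA                   = c("GOLDBAR", "EAT-A-BAGEL", "RESTAURANT ON 58 STREET"),
  VIOLATION.DESCRIPTION = c("Plumbing not properly installed", "",
                            "Cold food item held above 41 F"),
  CRITICAL.FLAG         = c("N", "", "Y"),
  SCORE                 = c(13, 0, 9),
  stringsAsFactors = FALSE
)

# Empty strings are not NA, so this first count is zero everywhere.
print(colSums(is.na(inspections)))

# Hypothetical helper: turn blank or whitespace-only strings into NA.
blank_to_na <- function(x) {
  if (is.character(x)) {
    x[!is.na(x) & trimws(x) == ""] <- NA
  }
  x
}

# Apply the helper column by column while keeping the data frame structure.
inspections[] <- lapply(inspections, blank_to_na)

# Now the blanks are counted as missing.
print(colSums(is.na(inspections)))
```

An alternative is to handle this at load time: `read.csv()` accepts an `na.strings` argument, so passing `na.strings = c("", "NA")` treats blank fields as missing from the start.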
Since our dataset is large enough, we can delete these rows. We then double-check that no NA’s remain.
## DBA VIOLATION.DESCRIPTION CRITICAL.FLAG
## 0 0 0
## SCORE GRADE GRADE.DATE
## 0 0 0
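Whether a dataset is "large enough" to drop rows can be made concrete by computing the share of incomplete rows before deleting them. A minimal sketch, using an invented `inspections` data frame:

```r
# Toy stand-in for the inspection data; values are invented.
inspections <- data.frame(
  DBA           = c("A", "B", "C", "D", "E"),
  CRITICAL.FLAG = c("N", NA, "Y", "Y", "N"),
  SCORE         = c(13, 0, 9, 9, 12),
  stringsAsFactors = FALSE
)

# Share of rows with at least one NA (mean of a logical vector is a proportion).
share_incomplete <- mean(!complete.cases(inspections))
print(share_incomplete)

# Keep only the complete rows, then double-check that nothing is missing.
inspections_clean <- inspections[complete.cases(inspections), ]
print(colSums(is.na(inspections_clean)))
```

Subsetting with `complete.cases()` keeps the same rows that `na.omit()` would, and the logical vector can be reused to report how much data was lost.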