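What will I learn?

• Deal with an imperfect, real-world dataset
• Validate a machine learning result using test data
• Evaluate a machine learning result using quantitative metrics
• Create, select, and transform features
• Compare the performance of machine learning algorithms
• Tune machine learning algorithms for maximum performance
• Clearly describe your machine learning algorithm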

Resources needed

git clone https://github.com/udacity/ud120-projects.git

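poi_id.py: Starter code for the POI identifier. You will write your analysis here. You will also submit a copy of this file so that evaluators can check your algorithm and results.

final_project_dataset.pkl: The project dataset; details below.

tester.py: When you submit your analysis for evaluation by Udacity, you will include the algorithm, the dataset, and the list of features you used (these are created automatically in poi_id.py). The evaluator will then use this code to test your results and confirm that the performance is similar to what you state in your report. You do not need to do anything with this code; we show it only for your reference.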
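As a rough illustration of how a pickled dataset such as final_project_dataset.pkl might be read, here is a minimal sketch. The helper name load_dataset, the toy records, and the assumption that the data is a dictionary keyed by person name are all hypothetical; the sketch writes its own small pickle file so it does not depend on the real dataset.

```python
import os
import pickle
import tempfile


def load_dataset(path):
    """Load a pickled object from `path` (hypothetical helper).

    Pickle files must be opened in binary mode ("rb").
    """
    with open(path, "rb") as f:
        return pickle.load(f)


# Invented toy records standing in for the real project data.
# Assumption: the dataset is a dict of {person name: {feature: value}}.
toy = {
    "PERSON A": {"poi": 1, "salary": 100},
    "PERSON B": {"poi": 0, "salary": 50},
}

with tempfile.TemporaryDirectory() as tmp:
    toy_path = os.path.join(tmp, "toy_dataset.pkl")
    # Write the toy data first so there is something to load.
    with open(toy_path, "wb") as f:
        pickle.dump(toy, f)
    data_dict = load_dataset(toy_path)

n_people = len(data_dict)
print(n_people, sorted(data_dict))
```

With the real project, the same helper would be pointed at final_project_dataset.pkl instead of the temporary file. Only unpickle files from a source you trust, since loading a pickle can execute arbitrary code.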

Steps to success

POI label: ['poi'] (boolean, represented as an integer)
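To make the label encoding concrete, the following sketch shows one way a boolean 'poi' flag could be turned into a 0/1 integer target. The record names and values are invented for illustration and are not taken from the project dataset.

```python
# Hypothetical toy records; names and values are invented for illustration.
toy_records = {
    "PERSON A": {"poi": True},
    "PERSON B": {"poi": False},
    "PERSON C": {"poi": True},
}

# int(True) == 1 and int(False) == 0, so each flag becomes a 0/1 label.
labels = [int(record["poi"]) for record in toy_records.values()]

# Summing the integer labels counts the positive (POI) class.
poi_count = sum(labels)
print(labels, poi_count)
```

Counting the positive class this way is a quick check on how imbalanced the labels are before choosing evaluation metrics.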

Project Evaluation

Final Project Evaluation Instructions

When you’re finished, your project will have 2 parts: the code/classifier you create and some written documentation of your work. Share your project with others and self-evaluate your project according to the rubric here.

Before you start working on the project: Review the final project rubric carefully. Think about the following questions: How will you incorporate each of the rubric criteria into your project? Why are these aspects important? What is your strategy to ensure that your project “meets specifications” for the given criteria? Once you are convinced that you understand each part of the rubric, start working on your project. Remember to refer to the rubric often to ensure that you are on the right track.

Items to include when sharing your work with others for feedback:

Code/Classifier

When making your classifier, you will create three pickle files (my_dataset.pkl, my_classifier.pkl, my_feature_list.pkl). The project evaluator will test these using the tester.py script. You are encouraged to run this script yourself before submitting, to gauge whether your performance is good enough. You should also include your modified poi_id.py file in case of any issues with running your code or to verify what is reported in your question responses (see next paragraph).
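The three pickle files could be produced along the following lines. This is a minimal sketch, not the starter code's own routine: the helper dump_project_files, the toy dataset, the toy feature list, and the placeholder classifier object are all hypothetical, and the files are written to a temporary directory so nothing in a real project folder is overwritten.

```python
import os
import pickle
import tempfile


def dump_project_files(clf, dataset, feature_list, out_dir="."):
    """Write the three pickle files named in the text into `out_dir`.

    Hypothetical helper; returns the sorted list of file names written.
    """
    outputs = {
        "my_classifier.pkl": clf,
        "my_dataset.pkl": dataset,
        "my_feature_list.pkl": feature_list,
    }
    for filename, obj in outputs.items():
        # Pickle files must be written in binary mode ("wb").
        with open(os.path.join(out_dir, filename), "wb") as f:
            pickle.dump(obj, f)
    return sorted(outputs)


# Invented stand-ins for illustration only.
toy_dataset = {
    "PERSON A": {"poi": 1, "salary": 100},
    "PERSON B": {"poi": 0, "salary": 50},
}
toy_features = ["poi", "salary"]
# In a real project this would be the fitted classifier object.
toy_clf = {"note": "placeholder for a fitted classifier"}

with tempfile.TemporaryDirectory() as tmp:
    written = dump_project_files(toy_clf, toy_dataset, toy_features, out_dir=tmp)
    # Read one file back to confirm the round trip works.
    with open(os.path.join(tmp, "my_feature_list.pkl"), "rb") as f:
        restored_features = pickle.load(f)

print(written)
```

Reading a file back, as done here for the feature list, is a cheap sanity check that the pickles are usable before they are handed to an evaluation script.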