MongoDB Commands
Submitted by Carolina2127 • January 19, 2021 • Assignment • 472 words (2 pages) • 127 views
Creating the database
import json
import requests
import pymongo
import numpy as np
import datetime
import pandas as pd

client = pymongo.MongoClient("mongodb://localhost:27017/?readPreference=primary&appname=MongoDB%20Compass&ssl=false")
db = client.activity2
collections_structure = [
{'collection': 'agriculture_horticulture_annual', 'filepath': './Agriculture horticulture information for Maori farms, annual.csv'},
{'collection': 'agriculture_land_use_annual', 'filepath': './Agriculture land-use information for Maori farms, annual.csv'},
{'collection': 'agriculture_livestock_info_annual', 'filepath': 'Agriculture livestock information for Maori farms, annual.csv'},
{'collection': 'business_demography_enterprises_authorities_annual', 'filepath': './Business demography enterprises for Maori authorities, annual.csv'},
{'collection': 'business_demography_enterprises_annual', 'filepath': './Business demography enterprises for Maori SMEs, annual.csv'},
{'collection': 'busisness_operation_rates_activities_annual', 'filepath': './Business operations rates, activities, annual.csv'},
{'collection': 'leed_estimates_filled_job_quarterly', 'filepath': './LEED estimates of filled jobs, quarterly.csv'},
{'collection': 'leed_worker_turnover_rates_quarterly', 'filepath': './LEED worker turnover rates, quarterly.csv'}
]
for elem in collections_structure:
    df = pd.read_csv(elem['filepath'])
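The loading loop stops right after `pd.read_csv`; the rest of its body is not preserved in the text. A minimal sketch of how such a loop might finish is given here, assuming each CSV row becomes one document in the collection named next to it. To stay dependency-free it uses the standard `csv` module instead of pandas, so the numeric conversion that pandas would infer is done by a hypothetical `coerce` helper. `csv_to_documents` and `load_collections` are also illustrative names, and `db` is assumed to behave like a pymongo `Database`.

```python
import csv


def coerce(value):
    """Turn a CSV string into int or float when possible, else return it unchanged."""
    for cast in (int, float):
        try:
            return cast(value)
        except (ValueError, TypeError):
            continue
    return value


def csv_to_documents(filepath):
    """Read one CSV file and return a list of dicts, one per row."""
    with open(filepath, newline="", encoding="utf-8") as handle:
        return [
            {key: coerce(value) for key, value in row.items()}
            for row in csv.DictReader(handle)
        ]


def load_collections(db, collections_structure):
    """Insert each CSV into its collection and return the row count per collection.

    Assumption: `db[name]` returns an object with `insert_many`, as a
    pymongo Database does.
    """
    inserted = {}
    for elem in collections_structure:
        documents = csv_to_documents(elem["filepath"])
        if documents:  # pymongo's insert_many rejects an empty list
            db[elem["collection"]].insert_many(documents)
        inserted[elem["collection"]] = len(documents)
    return inserted
```

With pandas, as in the original code, the rows could instead be produced with `df.to_dict('records')` and passed to `insert_many` in the same way.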
For the LEED worker turnover rates collection, create a new field...
- Delete, for all documents in the collection...
- For the "Agriculture horticulture information" collection...
This step is not necessary because the data was imported with pandas.
- For the "business operations, rates and activities" collection, obtain...
- For the "business demography enterprises for Maori authorities" collection, check that the value...
Without grouping by date
The value is incorrect: the documents with Industry: "Total" add up to 12 fewer Enterprises than the sum of the individual industries.
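The query behind this ungrouped check is not preserved in the text. One way it might be written from Python is sketched here, assuming `collection` is a pymongo-style collection whose documents carry the `Industry` and `Enterprises` fields used elsewhere in this document. `sum_enterprises` and `enterprises_gap` are hypothetical helper names.

```python
def sum_enterprises(collection, industry_filter):
    """Sum the Enterprises field over the documents matching industry_filter."""
    pipeline = [
        {"$match": {"Industry": industry_filter}},
        # _id: None puts every matching document into a single group
        {"$group": {"_id": None, "total": {"$sum": "$Enterprises"}}},
    ]
    result = list(collection.aggregate(pipeline))
    return result[0]["total"] if result else 0


def enterprises_gap(collection):
    """Return (sum over individual industries) minus (sum over the 'Total' rows)."""
    by_industry = sum_enterprises(collection, {"$ne": "Total"})
    reported = sum_enterprises(collection, "Total")
    return by_industry - reported
```

A result of 0 would mean the "Total" rows agree with the individual industries; any other value is the size of the discrepancy.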
Grouping by date
db.getCollection('business_demography_enterprises_authorities_annual').aggregate([
    {
        $match: {Industry: {$ne: 'Total'}}
    },
    {
        $group: {_id: '$Year', 'Enterprises Calculated Sum': {$sum: '$Enterprises'}}
    },
    {
        $sort: {_id: 1}
    }
])
db.getCollection('business_demography_enterprises_authorities_annual').find({Industry: 'Total'}, {Year: 1, Enterprises: 1}).sort({Year: 1})
Grouping by year shows that the years whose calculated sum differs from the Total are 2010, 2013, 2015 and 2017.
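The comparison between the two result sets was apparently done by eye. A small sketch of how it could be automated from Python follows, assuming one "Total" document per year and a pymongo-style `collection`. `mismatched_years` and `check_collection` are hypothetical names; the field names are taken from the two shell queries.

```python
def mismatched_years(summed_docs, total_docs):
    """Return {year: (calculated_sum, reported_total)} for the years that disagree.

    summed_docs: documents produced by the $group stage
                 (keys "_id" and "Enterprises Calculated Sum")
    total_docs:  documents returned by find({Industry: 'Total'})
                 (keys "Year" and "Enterprises")
    """
    reported = {doc["Year"]: doc["Enterprises"] for doc in total_docs}
    differences = {}
    for doc in summed_docs:
        year = doc["_id"]
        calculated = doc["Enterprises Calculated Sum"]
        if reported.get(year) != calculated:
            differences[year] = (calculated, reported.get(year))
    return differences


def check_collection(collection):
    """Run both queries against a pymongo-style collection and compare them."""
    summed = collection.aggregate([
        {"$match": {"Industry": {"$ne": "Total"}}},
        {"$group": {"_id": "$Year",
                    "Enterprises Calculated Sum": {"$sum": "$Enterprises"}}},
        {"$sort": {"_id": 1}},
    ])
    totals = collection.find(
        {"Industry": "Total"}, {"Year": 1, "Enterprises": 1}
    ).sort("Year", 1)
    return mismatched_years(summed, totals)
```

The keys of the returned dictionary are the years to investigate; a year missing from the "Total" rows shows up with `None` as its reported value.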
- Merge two collections into one...
- For the new collection, calculate the total number of employees...
Totals
By year
db.getCollection('business_demography_enterprises_merged').aggregate([
{$match: {Industry: 'Total'}},
{$group: {_id: '$Year', Employees: {$sum: '$EmployeeCount'}}}
])
RESULTS
/* 1 */
{
"_id" : 2012,
"Employees" : 14600.0
}
/* 2 */
{
"_id" : 2014,
"Employees" : 16100.0
}
/* 3 */
{
"_id" : 2016,
"Employees" : 19700.0
}
/* 4 */
{
"_id" : 2018,
"Employees" : 20500.0
}
/* 5 */
{
"_id" : 2011,
"Employees" : 13800.0
}
/* 6 */
{
"_id" : 2013,
"Employees" : 14700.0
...
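The results above come back in no particular year order, because `$group` does not order its output. A hedged Python sketch of the same per-year aggregation with an explicit `$sort` stage added is shown here. `employees_by_year` is a hypothetical helper, and `collection` is assumed to be a pymongo-style handle to `business_demography_enterprises_merged`.

```python
def employees_by_year(collection):
    """Sum EmployeeCount per Year over the 'Total' rows, ordered by year."""
    pipeline = [
        {"$match": {"Industry": "Total"}},
        {"$group": {"_id": "$Year", "Employees": {"$sum": "$EmployeeCount"}}},
        # $group gives no ordering guarantee, so sort by year explicitly
        {"$sort": {"_id": 1}},
    ]
    # dicts keep insertion order, so the result stays sorted by year
    return {doc["_id"]: doc["Employees"] for doc in collection.aggregate(pipeline)}
```

Returning a plain dictionary keyed by year makes it easy to look up a single year or to print the totals in chronological order.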