Post

🐘 PostgreSQL κΈ°λ³Έ 12 - λ‚ μ§œμ™€ μ‹œκ°„μ„ μ‚¬μš©ν•œ μž‘μ—…

🐘 PostgreSQL κΈ°λ³Έ 12 - λ‚ μ§œμ™€ μ‹œκ°„μ„ μ‚¬μš©ν•œ μž‘μ—…

πŸ“™ γ€Žμ‹€μš© SQL』을 읽고 μ •λ¦¬ν•œ κΈ€μž…λ‹ˆλ‹€.

λ‚ μ§œ 및 μ‹œκ°„μ— λŒ€ν•œ 데이터 νƒ€μž…κ³Ό ν•¨μˆ˜ μ΄ν•΄ν•˜κΈ°

  • λ‹€μŒμ€ λ‚ μ§œ 및 μ‹œκ°„ κ΄€λ ¨ 데이터 νƒ€μž…μ΄λ‹€.
  1. timestamp
    • λ‚ μ§œμ™€ μ‹œκ°„μ„ κΈ°λ‘ν•œλ‹€.
    • ν‘œμ€€ μ‹œκ°„λŒ€λ₯Ό 포함해 μ‹œκ°„μ„ μ €μž₯ν•˜κ³  μ‹Άλ‹€λ©΄ with time zone ν‚€μ›Œλ“œλ₯Ό μΆ”κ°€ν•΄μ•Ό ν•œλ‹€.
    • timestamp with time zone ν˜•μ‹μ€ SQL ν‘œμ€€μ˜ 일뢀이며, PostgreSQLμ—λŠ” 그와 λ™μΌν•œ 데이터 νƒ€μž…μΈ timestamptzκ°€ μžˆλ‹€.
    • ν‘œμ€€ μ‹œκ°„λŒ€λŠ” UTC μ˜€ν”„μ…‹, μ˜μ—­/μœ„μΉ˜ μ§€μ •μž, λ˜λŠ” ν‘œμ€€ μ•½μ–΄λΌλŠ” μ„Έ κ°€μ§€ ν˜•μ‹μœΌλ‘œ μ§€μ •ν•  수 μžˆλ‹€.
    • μ‹œκ°„λŒ€κ°€ μ—†λŠ” μ‹œκ°„μ„ timestamptz열에 μ œκ³΅ν•  경우 λ°μ΄ν„°λ² μ΄μŠ€λŠ” μ„œλ²„μ˜ κΈ°λ³Έ 섀정을 μ‚¬μš©ν•˜μ—¬ μ‹œκ°„λŒ€ 정보λ₯Ό μΆ”κ°€ν•œλ‹€.
  2. date
    • λ‚ μ§œλ§Œ κΈ°λ‘ν•˜λ©°, SQL ν‘œμ€€μ˜ 일뢀닀.
    • with time zone ν‚€μ›Œλ“œλ₯Ό μ‚¬μš©ν•˜λ©΄ μ—¬λŸ¬ λ‚ μ§œ ν˜•μ‹μ„ μ‚¬μš©ν•  수 μžˆλ‹€.
    • λ‚ μ§œλ₯Ό ν‘œμ‹œν•˜λ €λ©΄ ISO 8601 ꡭ제 ν‘œμ€€ ν˜•μ‹μ΄μž PostgreSQL κΈ°λ³Έ 좜λ ₯인 YYYY-MM-DD 포맷 μ‚¬μš©μ΄ ꢌμž₯λœλ‹€.
    • ISO ν˜•μ‹μ„ μ‚¬μš©ν•˜λ©΄ 데이터λ₯Ό ꡭ제적으둜 κ³΅μœ ν•  λ•Œ ν˜Όμ„ μ„ λ°©μ§€ν•  수 μžˆλ‹€.
  3. time
    • μ‹œκ°„λ§Œ κΈ°λ‘ν•˜λ©°, SQL ν‘œμ€€μ˜ 일뢀닀.
    • with time zone ν‚€μ›Œλ“œλ₯Ό μ‚¬μš©ν•˜λ©΄ μ—΄μ—μ„œ μ‹œκ°„λŒ€λ₯Ό μΈμ‹ν•˜μ§€λ§Œ, λ‚ μ§œκ°€ μ—†μœΌλ©΄ μ‹œκ°„λŒ€λŠ” μ˜λ―Έκ°€ μ—†λ‹€.
    • 이 점을 κ³ λ €ν•˜λ©΄ with time zone ν‚€μ›Œλ“œλ‚˜ timestapmtzλŠ” μ‚¬μš©ν•˜μ§€ μ•ŠλŠ” 것이 μ’‹λ‹€.
    • ISO 8601 ν˜•μ‹μ€ HH:MM:SS이고 μ‹œκ°„, λΆ„, 초λ₯Ό λ‚˜νƒ€λ‚Έλ‹€.
  4. interval
    • quantity unit ν˜•μ‹μœΌλ‘œ ν‘œν˜„λœ μ‹œκ°„ λ‹¨μœ„λ₯Ό λ‚˜νƒ€λ‚΄λŠ” 값을 λ³΄μœ ν•œλ‹€.
    • 12 daysλ‚˜ 8 hours와 같이 κΈ°κ°„λ§Œ κΈ°λ‘ν•œλ‹€.
    • SQL ν‘œμ€€μ˜ μΌλΆ€μ§€λ§Œ, PostgreSQL μ „μš© ꡬ문은 더 λ§Žμ€ μ˜΅μ…˜μ„ μ œκ³΅ν•œλ‹€.
  • μ„Έ 데이터 νƒ€μž… date, time, timestamp with time zone은 datetime νƒ€μž…μ΄λΌκ³  λΆ€λ₯΄λ©°, κ·Έ 값을 datetimes라고 ν•œλ‹€.
  • interval νƒ€μž…μ˜ 값은 intervals라고 ν•œλ‹€.

λ‚ μ§œμ™€ μ‹œκ°„ μ‘°μž‘ν•˜κΈ°

  • λ‚ μ§œ 및 μ‹œκ°„ 데이터 νƒ€μž…, ꡬ문, ν•¨μˆ˜ 등은 λ°μ΄ν„°λ² μ΄μŠ€μ— 따라 SQL ν‘œμ€€μ—μ„œ λ²—μ–΄λ‚˜λŠ” κ²½μš°κ°€ λ§ŽμœΌλ―€λ‘œ μ£Όμ˜ν•΄μ•Ό ν•œλ‹€.

νƒ€μž„μŠ€νƒ¬ν”„ κ°’μ˜ ꡬ성 μš”μ†Œ μΆ”μΆœν•˜κΈ°

1
2
3
4
5
6
7
8
9
10
11
SELECT
	date_part('year', '2022-12-01 18:37:12 EST'::timestamptz) AS year,
	date_part('month', '2022-12-01 18:37:12 EST'::timestamptz) AS month,
	date_part('day', '2022-12-01 18:37:12 EST'::timestamptz) AS day,
	date_part('hour', '2022-12-01 18:37:12 EST'::timestamptz) AS hour,
	date_part('minute', '2022-12-01 18:37:12 EST'::timestamptz) AS minute,
	date_part('seconds', '2022-12-01 18:37:12 EST'::timestamptz) AS seconds,
	date_part('timezone_hour', '2022-12-01 18:37:12 EST'::timestamptz) AS tz,
	date_part('week', '2022-12-01 18:37:12 EST'::timestamptz) AS week,
	date_part('quarter', '2022-12-01 18:37:12 EST'::timestamptz) AS quarter,
	date_part('epoch', '2022-12-01 18:37:12 EST'::timestamptz) AS epoch;
  • μœ„ λͺ…령문은 date_part()λ₯Ό μ‚¬μš©ν•˜μ—¬ timestamp κ°’μ˜ ꡬ성 μš”μ†Œλ₯Ό μΆ”μΆœν•˜λŠ” 쿼리닀.
  • ν•΄λ‹Ή ν•¨μˆ˜μ˜ 첫 번째 μΈμžμ—λŠ” λ‚ μ§œ λ˜λŠ” μ‹œκ°„ 뢀뢄을 λ‚˜νƒ€λ‚΄λŠ” ν…μŠ€νŠΈκ°€ λ“€μ–΄κ°€κ³ , 두 번째 μΈμžμ—λŠ” date, time, timestamp 값이 λ“€μ–΄κ°„λ‹€.
  • κ²°κ³ΌλŠ” λ‹€μŒκ³Ό κ°™λ‹€.
yearmonthdayhourminutesecondstzweekquarterepoch
2,0221228371294841,669,937,832
  • tz 열은 UTC(ν˜‘μ •μ„Έκ³„μ‹œ)λ‘œλΆ€ν„° μ‹œκ°„ 차이 λ˜λŠ” μ˜€ν”„μ…‹μ„ λ³΄κ³ ν•œλ‹€.
  • κ°€λ Ή μœ„ κ²°κ³ΌλŠ” UTC보닀 9μ‹œκ°„ 이λ₯Έ μ‹œκ°„λŒ€λ₯Ό μ§€μ •ν•œλ‹€.

μ‹œκ°„λŒ€μ—μ„œ UTC μ˜€ν”„μ…‹μ„ μœ λ„ν•  μˆ˜λŠ” μžˆμ§€λ§Œ κ·Έ λ°˜λŒ€μ˜ κ²½μš°λŠ” λΆˆκ°€λŠ₯ν•˜λ‹€. 각 UTC μ˜€ν”„μ…‹μ€ μ—¬λŸ¬ λͺ…λͺ…λœ μ‹œκ°„λŒ€μ™€ ν‘œμ€€ 및 일광 μ ˆμ•½ μ‹œκ°„ λ³€ν˜•μ„ μ°Έμ‘°ν•  수 μžˆλ‹€.

  • week 열은 2022λ…„ 12μ›” 2일이 ν•΄λ‹Ή μ—°λ„μ˜ 48번째 주에 해당함을 보여쀀닀.
  • quarter 열은 ν•΄λ‹Ή λ‚ μ§œκ°€ μ˜¬ν•΄ 4뢄기에 속함을 μ•Œ 수 μžˆλ‹€.
  • epoch 열은 컴퓨터 μ‹œμŠ€ν…œκ³Ό ν”„λ‘œκ·Έλž˜λ° μ–Έμ–΄μ—μ„œ μ‚¬μš©λ˜λŠ” μΈ‘μ • 값을 λ‚˜νƒ€λ‚΄λŠ”λ°, UTC 0인 1970λ…„ 1μ›” 1일 μ˜€μ „ 12μ‹œ 이전 λ˜λŠ” μ΄ν›„λ‘œ 경과된 μ‹œκ°„μ„ 초 λ‹¨μœ„λ‘œ 보여쀀닀.

μœ λ‹‰μŠ€ μ‹œκ°„μ— μ£Όμ˜ν•˜μž. PostgreSQL의 date_part()λŠ” double precision νƒ€μž…μœΌλ‘œ μœ λ‹‰μŠ€ μ‹œκ°„μ„ λ°˜ν™˜ν•œλ‹€. double precision νƒ€μž…μ€ 뢀동 μ†Œμˆ˜μ μ΄λΌ 계산 μ‹œ μ˜€μ°¨κ°€ λ°œμƒν•˜κΈ°λ„ ν•œλ‹€. λ˜ν•œ μœ λ‹‰μŠ€ μ‹œκ°„μ„ μ‚¬μš©ν•˜λ©΄ 2038λ…„ λ¬Έμ œλ„ 쑰심해야 ν•œλ‹€. 일뢀 컴퓨터 μ‹œμŠ€ν…œμ—μ„œ νŠΉμ • μ‹œκ°„μ΄ μ§€λ‚˜λ©΄ μ˜€λ²„ν”Œλ‘œκ°€ λ°œμƒν•˜λŠ” 였λ₯˜λ₯Ό 2038λ…„ 문제라고 λΆ€λ₯Έλ‹€.

  • PostgreSQL은 date_part() ν•¨μˆ˜μ™€ λ™μΌν•œ λ°©μ‹μœΌλ‘œ datetimesλ₯Ό ꡬ문 λΆ„μ„ν•˜λŠ” ν‘œμ€€ SQL extract() ν•¨μˆ˜λ„ μ§€μ›ν•˜μ§€λ§Œ 두 κ°€μ§€ 이유둜 date_part()κ°€ ꢌμž₯λœλ‹€.
  1. 이름 자체만으둜 역할을 μƒκΈ°μ‹œν‚¨λ‹€.
  2. extract()λŠ” μ—¬λŸ¬ λ°μ΄ν„°λ² μ΄μŠ€μ—μ„œ 널리 μ§€μ›λ˜μ§€ μ•ŠλŠ”λ‹€.
  • κ·ΈλŸΌμ—λ„ extract()λ₯Ό μ‚¬μš©ν•΄μ•Ό ν•œλ‹€λ©΄ extract(text from value) ν˜•μ‹μ„ μ·¨ν•œλ‹€.
  • νƒ€μž„μŠ€νƒ¬ν”„μ—μ„œ 연도λ₯Ό κ°€μ Έμ˜€κΈ° μœ„ν•΄μ„œλŠ” extract('year' from '2022-12-01 18:37:12 EST'::timestamptz) 와 같이 μΏΌλ¦¬ν•œλ‹€.

νƒ€μž„μŠ€νƒ¬ν”„ ꡬ성 μš”μ†Œμ—μ„œ λ‚ μ§œ μ‹œκ°„ κ°’ λ§Œλ“€κΈ°

  • 연도, μ›”, 일이 λ³„λ„μ˜ 열에 μ‘΄μž¬ν•˜λŠ” 데이터셋을 λ°œκ²¬ν•˜λŠ” 것은 λ“œλ¬Έ 일이 μ•„λ‹ˆλ‹€.
  • μ΄λŸ¬ν•œ ꡬ성 μš”μ†Œμ—μ„œ PostgreSQL ν•¨μˆ˜λ₯Ό μ‚¬μš©ν•˜μ—¬ datetime 값을 생성할 수 μžˆλ‹€.
1
2
3
4
5
6
7
8
-- λ‚ μ§œ λ§Œλ“€κΈ°
SELECT make_date(2022, 2, 22);

-- μ‹œκ°„ λ§Œλ“€κΈ°
SELECT make_time(18, 4, 30.3);

-- μ‹œκ°„λŒ€κ°€ 적용된 timestamp λ§Œλ“€κΈ°
SELECT make_timestamptz(2022, 2, 22, 18, 4, 30.3, 'Europe/Lisbon');
  • 이 μ„Έ ν•¨μˆ˜μ—μ„œλŠ” integer νƒ€μž…μ˜ λ³€μˆ˜λ₯Ό μž…λ ₯으둜 μ‚¬μš©ν•˜μ§€λ§Œ, λ‹€μŒ 두 κ°€μ§€ μ˜ˆμ™Έκ°€ μžˆλ‹€.
  1. μ΄ˆλŠ” μ†Œμˆ˜λ‘œ 된 초 λ‹¨μœ„λ₯Ό μ œκ³΅ν•  수 있기 λ•Œλ¬Έμ—, double precision νƒ€μž…μ˜ 숫자둜 μ§€μ •ν•΄μ•Ό ν•œλ‹€.
  2. μ‹œκ°„λŒ€λŠ” text νƒ€μž…μ˜ λ¬Έμžμ—΄λ‘œ μ§€μ •ν•΄μ•Ό ν•œλ‹€.

ν˜„μž¬ λ‚ μ§œ 및 μ‹œκ°„ κ²€μƒ‰ν•˜κΈ°

  • 행을 μ—…λ°μ΄νŠΈν•  λ•Œ 쿼리의 μΌλΆ€λ‘œ ν˜„μž¬ λ‚ μ§œ λ˜λŠ” μ‹œκ°„μ„ 기둝해야 ν•˜λŠ” κ²½μš°μ— ν‘œμ€€ SQL도 이에 λŒ€ν•œ ν•¨μˆ˜λ₯Ό μ œκ³΅ν•œλ‹€.
1
2
3
4
5
6
7
SELECT
	current_timestamp,
	localtimestamp,
	current_date,
	current_time,
	localtime,
	now();
  1. current_timestamp
    • μ‹œκ°„λŒ€λ₯Ό ν¬ν•¨ν•œ ν˜„μž¬ νƒ€μž„μŠ€νƒ¬ν”„λ₯Ό λ°˜ν™˜ν•œλ‹€.
    • PostgreSQL μ „μš© 단좕 버전은 now()이닀.
  2. localtimestamp
    • μ‹œκ°„λŒ€λ₯Ό ν¬ν•¨ν•˜μ§€ μ•Šμ€ ν˜„μž¬ νƒ€μž„μŠ€νƒ¬ν”„λ₯Ό λ°˜ν™˜ν•œλ‹€.
    • μ‹œκ°„λŒ€κ°€ μ—†λŠ” νƒ€μž„μŠ€νƒ¬ν”„λŠ” μ˜λ―Έκ°€ μ—†μœΌλ‹ˆ ꢌμž₯λ˜μ§€ μ•ŠλŠ”λ‹€.
  3. current_date: λ‚ μ§œλ₯Ό λ°˜ν™˜ν•œλ‹€.
  4. current_time: μ‹œκ°„λŒ€λ₯Ό ν¬ν•¨ν•œ ν˜„μž¬ μ‹œκ°„μ„ λ°˜ν™˜ν•œλ‹€.
  5. localtime: μ‹œκ°„λŒ€λ₯Ό ν¬ν•¨ν•˜μ§€ μ•Šμ€ ν˜„μž¬ μ‹œκ°„μ„ λ°˜ν™˜ν•œλ‹€.
  • μ΄λŸ¬ν•œ ν•¨μˆ˜λ“€μ€ 쿼리가 μ‹œμž‘ν•  λ•Œ μ‹œκ°„μ„ κΈ°λ‘ν•˜λ―€λ‘œ 쿼리 μ‹€ν–‰ μ‹œκ°„κ³ΌλŠ” 관계가 μ—†λ‹€.
  • λ§Œμ•½ 쿼리 μ‹€ν–‰ 쀑 μ‹œκ³„κ°€ λ³€κ²½λ˜λŠ” 방식을 λ‚ μ§œμ™€ μ‹œκ°„μ— λ°˜μ˜ν•˜κ³  μ‹Άλ‹€λ©΄ clock_timestamp() ν•¨μˆ˜λ₯Ό μ‚¬μš©ν•˜μ—¬ μ‹œκ°„ 경과에 따라 기둝할 수 μžˆλ‹€.
  • ν•΄λ‹Ή ν•¨μˆ˜λŠ” λŒ€μš©λŸ‰ 쿼리λ₯Ό 느리게 λ§Œλ“€κ³  μ‹œμŠ€ν…œ μ œν•œμ΄ 적용될 수 μžˆμœΌλ―€λ‘œ μ£Όμ˜ν•΄μ•Ό ν•œλ‹€.
1
2
3
4
5
6
7
8
9
10
11
12
13
CREATE TABLE current_time_example (
	time_id integer GENERATED ALWAYS AS IDENTITY,
	current_timestamp_col timestamptz,
	clock_timestamp_col timestamptz
);

INSERT INTO current_time_example
		(current_timestamp_col, clock_timestamp_col)
	(SELECT current_timestamp,
		clock_timestamp()
	 FROM generate_series(1,1000));

SELECT * FROM current_time_example;
  • μœ„ INSERT λͺ…λ Ήλ¬Έμ˜ 첫 번째 열은 μ‹œμž‘ μ‹œκ°„μ„ κΈ°λ‘ν•˜λŠ” current_timestampλ₯Ό μ‚¬μš©ν–ˆλ‹€.
  • 두 번째 μ—΄μ—λŠ” clock_timestamp()을 톡해 각 ν–‰μ˜ μ‚½μž… μ‹œκ°„μ„ κΈ°λ‘ν–ˆλ‹€.
  • PostgreSQL μ „μš© ꡬ문인 generate_series()λ₯Ό 톡해 1000개의 행을 μ‚½μž…ν–ˆλ‹€.

μ‹œκ°„λŒ€ 닀루기

μ‹œκ°„λŒ€ μ„€μ • μ°ΎκΈ°

1
2
SHOW timezone;
SELECT current_setting('timezone');
  • μœ„ 두 λͺ…λ Ήμ–΄ 쀑 ν•˜λ‚˜λ₯Ό μ‚¬μš©ν•˜λ©΄ ν˜„μž¬ μ‹œκ°„λŒ€λ₯Ό μ•Œ 수 μžˆλ‹€.

SHOW ALL; λͺ…λ Ήμ–΄λ₯Ό μ‚¬μš©ν•˜λ©΄ PostgreSQL μ„œλ²„μ˜ λͺ¨λ“  맀개 λ³€μˆ˜ 섀정을 μ•Œ 수 μžˆλ‹€.

1
SELECT make_timestamptz(2022, 2, 22, 18, 4, 30.3, current_setting('timezone'));
  • 두 λ¬Έμž₯ λͺ¨λ‘ λ™μΌν•œ 정보λ₯Ό μ œκ³΅ν•˜μ§€λ§Œ, λ‹€λ₯Έ ν•¨μˆ˜μ— λŒ€ν•œ μž…λ ₯μœΌλ‘œλŠ” current_setting()을 μ‚¬μš©ν•˜λŠ” 게 μ’‹λ‹€.
1
2
3
4
5
6
SELECT * FROM pg_timezone_abbrevs ORDER BY abbrev;
SELECT * FROM pg_timezone_names ORDER BY name;

SELECT * FROM pg_timezone_names
WHERE name LIKE 'Europe%'
ORDER BY name;
  • μœ„ 두 λͺ…령문은 μ‹œκ°„λŒ€ 약어와 이름을 좜λ ₯ν•˜λŠ” 쿼리닀.
  • λ§ˆμ§€λ§‰ 쿼리λ₯Ό 톡해 μ•½μ–΄λ₯Ό 필터링 ν•  μˆ˜λ„ μžˆλ‹€.
  • μ‹€ν–‰ κ²°κ³ΌλŠ” λ‹€μŒκ³Ό κ°™λ‹€.
nameabbrevutc_offsetis_dst
Europe/AmsterdamCEST02:00:00true
Europe/AndorraCEST02:00:00true
Europe/Astrakhan+0404:00:00false
Europe/AthensEEST03:00:00true

μ‹œκ°„λŒ€ μ„€μ •ν•˜κΈ°

  • PostgreSQL을 μ„€μΉ˜ν•  λ•Œ μ„œλ²„μ˜ κΈ°λ³Έ μ‹œκ°„λŒ€λŠ” postgresql.confμ—μ„œ 맀개 λ³€μˆ˜λ‘œ μ„€μ •λ˜μ—ˆλ‹€.
  • ν•΄λ‹Ή νŒŒμΌμ„ λ³€κ²½ν•˜λ©΄ λ‹€λ₯Έ μ‚¬μš©μžλ‚˜ μ‘μš© ν”„λ‘œκ·Έλž¨μ— μ˜λ„ν•˜μ§€ μ•Šμ€ κ²°κ³Όκ°€ λ°œμƒν•  수 μžˆμœΌλ―€λ‘œ μ£Όμ˜ν•΄μ•Ό ν•˜λ©°, 이번 μž₯μ—μ„œλŠ” μ„Έμ…˜ λ³„λ‘œ μ‹œκ°„λŒ€λ₯Ό μ„€μ •ν•˜λŠ” 방법을 μ•Œμ•„λ³Ό 것이닀.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
SET TIME ZONE 'America/Los_Angeles';

CREATE TABLE time_zone_test (
	test_date timestamptz
);

INSERT INTO time_zone_test VALUES ('2023-01-01 4:00');

SELECT test_date
FROM time_zone_test;

SET TIME ZONE 'America/Indiana/Petersburg';

SELECT test_date
FROM time_zone_test;

SELECT test_date AT TIME ZONE 'Asia/Seoul'
FROM time_zone_test;
test_date
2023-01-01 21:00:00.000 +0900
  • μœ„ κ²°κ³ΌλŠ” America/Los_Angeles, America/Indiana/Petersburg의 κ²°κ³Όλ‹€.
  • UTC보닀 9μ‹œκ°„μ΄ λΉ λ₯΄λ‹€λŠ” μ˜λ―Έλ‹€.
timezone
2023-01-01 21:00:00.000
  • λ§ˆμ§€λ§‰ 쿼리의 κ²°κ³ΌλŠ” μœ„μ™€ κ°™μœΌλ©°, λ‹€μŒκ³Ό 같은 사싀을 μ•Œ 수 μžˆλ‹€.
  1. timestamptzλŠ” μ ˆλŒ€μ  μ‹œκ°μ„ 보μž₯ν•œλ‹€.
  2. νƒ€μž„μ‘΄ 변경은 데이터λ₯Ό λ³€κ²½ν•˜λŠ” 것이 μ•„λ‹ˆλΌ, 좜λ ₯ 및 λ³€ν™˜ μ‹œμ μ—μ„œμ˜ ν‘œν˜„ 방식일 뿐이닀.
  3. 같은 행을 보더라도 μ„Έμ…˜ νƒ€μž„μ‘΄μ— 따라 달리 보일 수 μžˆλ‹€.

λ‚ μ§œ 및 μ‹œκ°„μ„ ν™œμš©ν•˜μ—¬ κ³„μ‚°ν•˜κΈ°

1
2
SELECT '1929-09-30'::date - '1929-09-27'::date;
SELECT '1929-09-30'::date + '5 years'::interval;
  • μœ„ λͺ…λ Ήλ¬Έκ³Ό 같이 datetime 및 interval νƒ€μž…μ— λŒ€ν•΄ κ°„λ‹¨ν•œ μ‚°μˆ μ„ μˆ˜ν–‰ν•  수 μžˆλ‹€.
  • 첫 번째 μΏΌλ¦¬λŠ” 두 λ‚ μ§œκ°€ μ •ν™•νžˆ 3일 λ–¨μ–΄μ Έ μžˆμŒμ„ μ•Œ 수 μžˆλ‹€.
  • 두 번째 μΏΌλ¦¬λŠ” 1934-09-30μ΄λΌλŠ” νƒ€μž„μŠ€νƒ¬ν”„ 값을 λ°˜ν™˜ν•œλ‹€.

λ‰΄μš•μ‹œ νƒμ‹œ λ°μ΄ν„°μ—μ„œ νŒ¨ν„΄ μ°ΎκΈ°

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
CREATE TABLE nyc_yellow_taxi_trips (
	trip_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
	vendor_id text NOT NULL,
	tpep_pickup_datetime timestamptz NOT NULL,
	tpep_dropoff_datetime timestamptz NOT NULL,
	passenger_count integer NOT NULL,
	trip_distance numeric(8,2) NOT NULL,
	pickup_longitude numeric(18,15) NOT NULL,
	pickup_latitude numeric(18,15) NOT NULL,
	rate_code_id text NOT NULL,
	store_and_fwd_flag text NOT NULL,
	dropoff_longitude numeric(18,15) NOT NULL,
	dropoff_latitude numeric(18,15) NOT NULL,
	payment_type text NOT NULL,
	fare_amount numeric(9,2) NOT NULL,
	extra numeric(9,2) NOT NULL,
	mta_tax numeric(5,2) NOT NULL,
	tip_amount numeric(9,2) NOT NULL,
	tolls_amount numeric(9,2) NOT NULL,
	improvement_surcharge numeric(9,2) NOT NULL,
	total_amount numeric(9,2) NOT NULL
);

COPY nyc_yellow_taxi_trips (
	vendor_id,
	tpep_pickup_datetime,
	tpep_dropoff_datetime,
	passenger_count,
	trip_distance,
	pickup_longitude,
	pickup_latitude,
	rate_code_id,
	store_and_fwd_flag,
	dropoff_longitude,
	dropoff_latitude,
	payment_type,
	fare_amount,
	extra,
	mta_tax,
	tip_amount,
	tolls_amount,
	improvement_surcharge,
	total_amount
   )
FROM 'C:\YourDirectory\nyc_yellow_taxi_trips.csv'
WITH (FORMAT CSV, HEADER);

CREATE INDEX tpep_pickup_idx
ON nyc_yellow_taxi_trips (tpep_pickup_datetime);

SELECT count(*) FROM nyc_yellow_taxi_trips;
  • μœ„ λͺ…령문은 ν…Œμ΄λΈ” 생성 및 λ‰΄μš•μ‹œμ˜ λ…Έλž€μƒ‰ νƒμ‹œ 데이터λ₯Ό κ°€μ Έμ˜€λŠ” 쿼리닀.
ν•˜λ£¨ 쀑 κ°€μž₯ λ°”μœ μ‹œκ°„λŒ€
1
2
3
4
5
6
SELECT
	date_part('hour', tpep_pickup_datetime) AS trip_hour,
	count(*)
FROM nyc_yellow_taxi_trips
GROUP BY trip_hour
ORDER BY trip_hour;
  • μ‹œκ°„ 별 νƒμ‹œ 승차 횟수λ₯Ό κ³„μ‚°ν•˜λŠ” 쿼리λ₯Ό 톡해 ν•˜λ£¨ 쀑 κ°€μž₯ λ°”μœ μ‹œκ°„λŒ€λ₯Ό μ•Œμ•„λ³΄μž.
trip_hourcount
017,383
118,031
217,998
319,125
418,053
515,069
618,513
722,689
823,190
923,098
1024,106
1122,554
1217,765
138,182
145,003
153,070
162,275
172,229
183,925
1910,825
2018,287
2121,062
2218,975
2317,367
μ—‘μ…€μ—μ„œ μ‹œκ°ν™”ν•˜κΈ° μœ„ν•΄ CSV둜 내보내기
1
2
3
4
5
6
7
8
9
10
COPY
	(SELECT
		date_part('hour', tpep_pickup_datetime) AS trip_hour,
		count(*)
	FROM nyc_yellow_taxi_trips
	GROUP BY trip_hour
	ORDER BY trip_hour
	)
TO 'C:\YourDirectory\hourly_taxi_pickups.csv'
WITH (FORMAT CSV, HEADER);
  • 데이터λ₯Ό μ—‘μ…€λ‘œ 뢈러온 λ’€ μ„  κ·Έλž˜ν”„λ₯Ό μƒμ„±ν•˜λ©΄ ν•΄λ‹Ή λ‚ μ§œμ˜ νŒ¨ν„΄μ΄ λ”μš± λͺ…ν™•ν•˜λ‹€.

  • λ¬Όλ‘  λ©°μΉ  λ˜λŠ” λͺ‡ 달에 걸친 데이터λ₯Ό 더 깊이 뢄석해야 ν•΄λ‹Ή 데이터λ₯Ό μΌλ°˜ν™”ν•  수 μžˆλ‹€.
  • date_part() ν•¨μˆ˜λ‘œ μš”μΌμ„ μΆ”μΆœν•˜μ—¬ 평일과 주말의 μŠΉμ°¨λŸ‰μ„ 비ꡐ할 μˆ˜λ„ 있고, 일기 예보λ₯Ό ν™•μΈν•˜μ—¬ 날씨에 λ”°λ₯Έ μŠΉμ°¨λŸ‰μ„ 비ꡐ할 μˆ˜λ„ μžˆλ‹€.
νƒμ‹œ 승차 ν›„ 이동 μ‹œκ°„μ€ μ–Έμ œ κ°€μž₯ κΈ΄κ°€μš”?
  • 닡을 μ°ΎλŠ” ν•œ κ°€μ§€ 방법은 각 μ‹œκ°„μ˜ 쀑간 이동 μ‹œκ°„μ„ κ³„μ‚°ν•˜λŠ” 것이닀.
  • 쀑앙 값은 μ •λ ¬λœ κ°’ μ„ΈνŠΈμ˜ 쀑간 κ°’μœΌλ‘œ, ν‰κ· κ³ΌλŠ” λ‹€λ₯΄κ²Œ μ„ΈνŠΈ μ•ˆμ˜ 맀우 μž‘κ±°λ‚˜ 큰 값이 κ²°κ³Όλ₯Ό μ™œκ³‘ν•˜μ§€ μ•ŠκΈ° λ•Œλ¬Έμ— 비ꡐ할 λ•ŒλŠ” 쀑앙 값이 평균보닀 더 μ •ν™•ν•˜λ‹€.
1
2
3
4
5
6
7
8
SELECT
	date_part('hour', tpep_pickup_datetime) AS trip_hour,
	percentile_cont(.5)
		WITHIN GROUP (ORDER BY
			tpep_dropoff_datetime - tpep_pickup_datetime) AS median_trip
FROM nyc_yellow_taxi_trips
GROUP BY trip_hour
ORDER BY trip_hour;
  • percentile_cont()둜 쀑앙 값을 κ³„μ‚°ν•˜κΈ° μœ„ν•΄ tpep_dropoff_datetime - tpep_pickup_datetimeλ₯Ό κ³„μ‚°ν•˜μ˜€κ³ , κ·Έ 값을 쀑앙 κ°’ κ³„μ‚°μ˜ μ •λ ¬ κΈ°μ€€μœΌλ‘œ μ‚Όμ•˜λ‹€.
  • κ²°κ³ΌλŠ” λ‹€μŒκ³Ό κ°™λ‹€.
trip_hourmedian_trip
000:14:20
100:14:49
200:15:00
300:14:35
400:14:43
500:14:42
600:14:15
700:13:19
800:12:25
900:11:46
1000:11:54
1100:11:37
1200:11:14
1300:10:04
1400:09:27
1500:08:59
1600:09:57
1700:10:06
1800:07:37
1900:07:54
2000:10:23
2100:12:28
2200:13:11
2300:13:46

Amtrak λ°μ΄ν„°μ—μ„œ νŒ¨ν„΄ μ°ΎκΈ°

κΈ°μ°¨ 이동 μ‹œκ°„ κ³„μ‚°ν•˜κΈ°
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
CREATE TABLE train_rides (
	trip_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
	segment text NOT NULL,
	departure timestamptz NOT NULL,
	arrival timestamptz NOT NULL
);

INSERT INTO train_rides (segment, departure, arrival)
VALUES
	('Chicago to New York', '2020-11-13 21:30 CST', '2020-11-14 18:23 EST'),
	('New York to New Orleans', '2020-11-15 14:15 EST', '2020-11-16 19:32 CST'),
	('New Orleans to Los Angeles', '2020-11-17 13:45 CST', '2020-11-18 9:00 PST'),
	('Los Angeles to San Francisco', '2020-11-19 10:10 PST', '2020-11-19 21:24 PST'),
	('San Francisco to Denver', '2020-11-20 9:10 PST', '2020-11-21 18:38 MST'),
	('Denver to Chicago', '2020-11-22 19:10 MST', '2020-11-23 14:50 CST');

SELECT * FROM train_rides;
  • κΈ°μ°¨ 이동 데이터에 λŒ€ν•œ ν…Œμ΄λΈ”κ³Ό 행을 μΆ”κ°€ν•˜μ˜€κ³ , κ²°κ³ΌλŠ” λ‹€μŒκ³Ό κ°™λ‹€.
trip_idsegmentdeparturearrival
1Chicago to New York2020-11-14 12:30:00.000 +09002020-11-15 08:23:00.000 +0900
2New York to New Orleans2020-11-16 04:15:00.000 +09002020-11-17 10:32:00.000 +0900
3New Orleans to Los Angeles2020-11-18 04:45:00.000 +09002020-11-19 02:00:00.000 +0900
4Los Angeles to San Francisco2020-11-20 03:10:00.000 +09002020-11-20 14:24:00.000 +0900
5San Francisco to Denver2020-11-21 02:10:00.000 +09002020-11-22 10:38:00.000 +0900
6Denver to Chicago2020-11-23 11:10:00.000 +09002020-11-24 05:50:00.000 +0900
1
2
3
4
SELECT segment,
	to_char(departure, 'YYYY-MM-DD HH12:MI a.m. TZ') AS departure,
	arrival - departure AS segment_duration
FROM train_rides;
  • μœ„ λͺ…령문은 각 이동 κ΅¬κ°„μ˜ 길이λ₯Ό κ³„μ‚°ν•˜λŠ” 쿼리닀.
  • 이 μΏΌλ¦¬λŠ” μ—¬ν–‰ ꡬ간, 좜발 μ‹œκ°„, μ—¬ν–‰ 기간을 λ‚˜μ—΄ν•œλ‹€.
  • departure μ—΄μ˜ to_char() ν•¨μˆ˜λŠ” νƒ€μž„μŠ€νƒ¬ν”„ κ°’ YYYY-MM-DD HH12:MI a.m. TZ ν˜•μ‹μ˜ λ¬Έμžμ—΄λ‘œ λ³€ν™˜ν•œλ‹€.
  • HH12 뢀뢄은 24μ‹œκ°„ ꡰ사 μ‹œκ°„μ΄ μ•„λ‹Œ 12μ‹œκ°„ μ‹œκ³„λ₯Ό μ‚¬μš©ν•˜λ„λ‘ μ§€μ •ν•œλ‹€.
  • a.m. 뢀뢄은 λ§ˆμΉ¨ν‘œλ‘œ κ΅¬λΆ„λœ μ†Œλ¬Έμžλ₯Ό μ΄μš©ν•˜μ—¬ μ˜€μ „ λ˜λŠ” μ˜€ν›„ μ‹œκ°„μ„ ν‘œμ‹œν•˜λ„λ‘ μ§€μ •ν•˜κ³ , TZ 뢀뢄은 μ‹œκ°„λŒ€λ₯Ό λ‚˜νƒ€λ‚Έλ‹€.
  • λ§ˆμ§€λ§‰μœΌλ‘œ arriveμ—μ„œ departureλ₯Ό λΉΌμ„œ segment_time ꡬ간을 ν™•μΈν•œλ‹€.
  • κ²°κ³ΌλŠ” λ‹€μŒκ³Ό κ°™λ‹€.
segmentdeparturesegment_duration
Chicago to New York2020-11-14 12:30 p.m. KST19:53:00
New York to New Orleans2020-11-16 04:15 a.m. KST1 day 06:17:00
New Orleans to Los Angeles2020-11-18 04:45 a.m. KST21:15:00
Los Angeles to San Francisco2020-11-20 03:10 a.m. KST11:14:00
San Francisco to Denver2020-11-21 02:10 a.m. KST1 day 08:28:00
Denver to Chicago2020-11-23 11:10 a.m. KST18:40:00
  • μœ„ 결과와 같이 ν•œ νƒ€μž„μŠ€νƒ¬ν”„μ—μ„œ λ‹€λ₯Έ νƒ€μž„μŠ€νƒ¬ν”„λ₯Ό λΉΌλ©΄ interval 데이터 νƒ€μž…μ΄ μƒμ„±λœλ‹€.
λˆ„μ  이동 μ‹œκ°„ κ³„μ‚°ν•˜κΈ°
1
2
3
4
SELECT segment,
	arrival - departure AS segment_duration,
	sum(arrival - departure) OVER (ORDER BY trip_id) AS cume_duration
FROM train_rides;
  • OVER (ORDER BY ...) ꡬ문을 톡해 λˆ„μ  합을 μΏΌλ¦¬ν–ˆλ‹€.
segmentsegment_durationcume_duration
Chicago to New York19:53:0019:53:00
New York to New Orleans1 day 06:17:001 day 26:10:00
New Orleans to Los Angeles21:15:001 day 47:25:00
Los Angeles to San Francisco11:14:001 day 58:39:00
San Francisco to Denver1 day 08:28:002 days 67:07:00
Denver to Chicago18:40:002 days 85:47:00
  • 전체 이동 μ‹œκ°„μ„ κ³„μ‚°ν•˜μ˜€λŠ”λ°, κ²°κ³Ό λˆ„κ³„λŠ” μ •ν™•ν•˜μ§€λ§Œ μœ μš©ν•˜μ§€ μ•Šμ€ ν˜•μ‹μœΌλ‘œ 좜λ ₯λ˜μ—ˆμŒμ„ μ•Œ 수 μžˆλ‹€.
  • μ΄λŠ” PostgreSQL이 일 뢀뢄에 λŒ€ν•œ 합계와 μ‹œκ°„ 뢀뢄에 λŒ€ν•œ 합계λ₯Ό λ”°λ‘œ 생성함을 μ•Œ 수 μžˆλ‹€.
  • μ΄λŸ¬ν•œ μ œν•œμ„ μš°νšŒν•˜κΈ° μœ„ν•΄ λ‹€μŒκ³Ό 같이 쿼리할 수 μžˆλ‹€.
1
2
3
4
5
SELECT segment,
	arrival - departure AS segment_duration,
	justify_interval(sum(arrival - departure)
		OVER (ORDER BY trip_id)) AS cume_duration
FROM train_rides;
segmentsegment_durationcume_duration
Chicago to New York19:53:0019:53:00
New York to New Orleans1 day 06:17:002 days 02:10:00
New Orleans to Los Angeles21:15:002 days 23:25:00
Los Angeles to San Francisco11:14:003 days 10:39:00
San Francisco to Denver1 day 08:28:004 days 19:07:00
Denver to Chicago18:40:005 days 13:47:00
  • justify_interval() ν•¨μˆ˜λŠ” 24μ‹œκ°„μ€ 일둜, 30일은 μ›”λ‘œ λ³€ν™˜ν•˜λ„λ‘ 간격 κ³„μ‚°μ˜ 좜λ ₯을 ν‘œμ€€ν™”ν•œλ‹€.
  • 이λ₯Ό 톡해 좜λ ₯이 λ”μš± μ΄ν•΄ν•˜κΈ° μ‰¬μ›Œμ‘Œλ‹€.
This post is licensed under CC BY 4.0 by the author.